POLYNUCLEOTIDES CONTROLLING THE EXPRESSION OF AND CODING FOR GENE B IN TOMATO

FIELD AND BACKGROUND OF THE INVENTION

The present invention relates to novel polynucleotide sequences
isolated from tomato and, more particularly, to a novel lycopene cyclase 
gene and novel control elements controlling its specific expression in 
chromogenic tissues of plants, e.g., fruit and flower. 

Carotenoids - functions and biosynthesis: Carotenoids comprise
one of the largest classes of pigments in nature. In photosynthetic
organisms carotenoids serve two major functions: as accessory pigments
for light harvesting, and as protective agents against photooxidation
processes in the photosynthetic apparatus. Another important role of
carotenoids in plants, as well as in some animals, is that of providing
distinctive pigmentation. Most of the orange, yellow, or red colors found in
the flowers, fruits and other organs of many higher plant species are due to
accumulation of carotenoids in the cells.

The biosynthesis of carotenoids has been reviewed extensively
(Britton, 1988; Sandmann, 1994a). Carotenoids are produced from the
general isoprenoid biosynthetic pathway, which in plants takes place in the
chloroplasts of photosynthetic tissues and chromoplasts of fruits and
flowers.

The first unique step in carotenoid biosynthesis is the head-to-head
condensation of two molecules of geranylgeranyl pyrophosphate (GGPP) to
produce phytoene (Figure 1). All the subsequent steps in the pathway occur
in association with membranes. Four desaturation (dehydrogenation)
reactions convert phytoene to lycopene via phytofluene, ζ-carotene, and
neurosporene, as intermediates. Two cyclization reactions convert lycopene
to β-carotene (Figure 1). Further reactions involve the addition of various
oxygen-containing side groups which form the various xanthophyll species
(not shown).

It has been established in recent years that four enzymes in plants
catalyze the biosynthesis of β-carotene from GGPP: phytoene synthase,
phytoene desaturase, ζ-carotene desaturase and lycopene cyclase (reviewed
in Sandmann, 1994b). All enzymes in the pathway are nuclear encoded.
Genes for phytoene synthase and phytoene desaturase have been previously
cloned from tomato (Ray et al., 1992; Pecker et al., 1992).

The red color of ripe tomatoes is provided by lycopene, a linear
carotene which accumulates during fruit ripening as membrane-bound
crystals in chromoplasts (Laval-Martin et al., 1975). It is presumed to serve
as an attractant of predators that eat the fruit and disperse the seeds.
Accumulation of lycopene begins at the "breaker" stage of fruit ripening
after the fruit has reached the "mature green" stage. In the "breaker" stage,
which is indicated by the commencement of color change from green to
orange, chlorophyll is degraded and chloroplasts turn into chromoplasts
(Gillaspy et al., 1993; Grierson and Schuch, 1993). Total carotenoid
concentration increases 10- to 15-fold during the transition from
"mature green" to "red". This change is due mainly to a 300-fold increase
in lycopene (Fraser et al., 1994).

The cDNA which encodes lycopene β-cyclase, CrtL-b, was cloned
from tomato (Lycopersicon esculentum cv. VF36) and tobacco (Nicotiana
tabacum cv. Samsun NN) (Pecker et al., 1996; U.S. Pat. application No.
08/399,561 and PCT/US96/03044 (WO 96/28014), both of which are
incorporated by reference as if fully set forth herein) and was functionally
expressed in Escherichia coli. This enzyme converts lycopene to β-carotene
by catalyzing the formation of two β-rings, one at each end of the linear
carotene. The enzyme interacts with half of the carotenoid molecule and
requires a double bond at the C-7,8 (or C-7',8') position. Inhibition
experiments in E. coli indicated that lycopene cyclase is the target site for
the inhibitor 2-(4-methylphenoxy)triethylamine hydrochloride (MPTA,
Pecker et al., 1996). The primary structure of lycopene cyclase in higher
plants is significantly conserved with the enzyme from cyanobacteria but
differs from that of the non-photosynthetic bacteria Erwinia (Pecker et al.,
1996). Levels of mRNAs of CrtL-b and Pds, which encodes phytoene
desaturase, were measured in leaves, flowers and ripening fruits of tomato.
In contrast to genes that encode enzymes of early steps in the carotenoid
biosynthesis pathway, whose transcription increases during the "breaker"
stage of fruit ripening, the level of CrtL-b mRNA decreases at this stage
(Pecker et al., 1996). Hence, the accumulation of lycopene in tomato fruits
is apparently due to a down-regulation of the lycopene cyclase gene that
occurs at the breaker stage of fruit development. This conclusion supports
the hypothesis that transcriptional regulation of gene expression is a
predominant mechanism of regulating carotenogenesis.

The search for tissue-specific control elements in plants is ongoing;
however, only a limited number of tissue-specific control elements capable of
specifically directing gene expression in chromogenic tissues (fruit, flower)
have so far been isolated. These include the promoters of the genes E4 and
E8 (Montgomery et al., 1993), which are up-regulated by an increase in
ethylene concentration during tomato fruit ripening, and the tomato 2A11
gene (Van Haaren and Houck, 1991) and the polygalacturonase (PG) gene
(Nicholass et al., 1995; Montgomery et al., 1993), which are up-regulated in
tomato fruits during ripening.

There is thus a widely recognized need for, and it would be highly
advantageous to have, novel tissue-specific control elements capable of
specifically directing gene expression in chromogenic tissues.

The search for structural genes encoding enzymes associated with 
carotenogenesis is ongoing, and every new gene isolated not only provides 
insight into carotenogenesis, but also provides a tool to control and modify 
carotenogenesis for commercial purposes (Hirschberg et al., 1997;
Cunningham and Gantt, 1998).

There is thus a widely recognized need for, and it would be highly 
advantageous to have, a novel lycopene cyclase capable of altering the 
composition of carotenoids in carotenoid-producing organisms.

SUMMARY OF THE INVENTION

According to one aspect of the present invention there is provided an
isolated complementary or genomic DNA segment comprising a nucleotide
sequence coding for a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NOs: 17, 18 and 19 and functional
naturally occurring and man-induced variants thereof, with the proviso
that the polypeptide has a major lycopene cyclase catalytic activity.

According to further features in preferred embodiments of the
invention described below, the nucleotide sequence is selected from the
group consisting of SEQ ID NOs: 8, 9, 10 and 11 and functional naturally
occurring and man-induced variants thereof.

According to still further features in the described preferred 
embodiments the nucleotide sequence is a cDNA or a genomic DNA 
isolated from tomato.

According to another aspect of the present invention there is
provided a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NOs: 17, 18 and 19 and functional naturally
occurring and man-induced variants thereof, the polypeptide having a major
lycopene cyclase catalytic activity.

According to another aspect of the present invention there is
provided a transduced cell overexpressing a polypeptide including an amino
acid sequence selected from the group consisting of SEQ ID NOs: 17, 18
and 19 and functional naturally occurring and man-induced variants thereof,
the polypeptide having a major lycopene cyclase catalytic activity, the cell
therefore overproducing β-carotene at the expense of lycopene.

According to still further features in the described preferred
embodiments the transduced cell is selected from the group consisting of a
prokaryotic cell and a eukaryotic cell.

According to still further features in the described preferred
embodiments the eukaryotic cell is of a higher plant.

According to still further features in the described preferred 
embodiments the cell forms a part of a transgenic plant. 

According to yet another aspect of the present invention there is
provided a method of down-regulating production of β-carotene in a cell
comprising the step of introducing into the cell at least one anti-sense
polynucleotide sequence capable of base pairing with messenger RNA
coding for a polypeptide including an amino acid sequence selected from
the group consisting of SEQ ID NOs: 17, 18 and 19 and functional naturally
occurring and man-induced variants thereof, the polypeptide having a major
lycopene cyclase catalytic activity, the cell therefore underproducing
β-carotene from lycopene.

According to still further features in the described preferred
embodiments the at least one anti-sense polynucleotide sequence includes a
synthetic oligonucleotide.

According to still further features in the described preferred 
embodiments the synthetic oligonucleotide includes a man-made 
modification rendering the synthetic oligonucleotide more stable in the
cellular environment.

According to still further features in the described preferred
embodiments the synthetic oligonucleotide is selected from the group
consisting of methylphosphonate oligonucleotide, monothiophosphate
oligonucleotide, dithiophosphate oligonucleotide, phosphoramidate
oligonucleotide, phosphate ester oligonucleotide, bridged phosphorothioate
oligonucleotide, bridged phosphoramidate oligonucleotide, bridged
methylenephosphonate oligonucleotide, dephospho internucleotide analogs
with siloxane bridges, carbonate bridge oligonucleotide, carboxymethyl
ester bridge oligonucleotide, acetamide bridge
oligonucleotide, carbamate bridge oligonucleotide, thioether bridge
oligonucleotide, sulfoxy bridge oligonucleotide, sulfono bridge
oligonucleotide and α-anomeric bridge oligonucleotide.

According to still further features in the described preferred
embodiments the at least one anti-sense polynucleotide sequence is encoded
by an expression vector.

According to still further features in the described preferred
embodiments the cell is selected from the group consisting of a prokaryotic
cell and a eukaryotic cell.

According to still further features in the described preferred 
embodiments the eukaryotic cell is of a higher plant. 

According to still further features in the described preferred 
embodiments the cell forms a part of a transgenic plant. 

According to still another aspect of the present invention there is
provided an expression construct for directing an expression of a gene in
fruit or flower comprising a regulatory sequence selected from the group
consisting of an upstream region of a B allele of tomato and an upstream
region of a b allele of tomato.

According to still further features in the described preferred
embodiments the expression construct comprises a functional part of
nucleotides 1-1210 of SEQ ID NO: 14 or nucleotides 1-1600 of SEQ ID
NO: 15, or functional naturally occurring and man-induced variants thereof.

According to still further features in the described preferred
embodiments the expression construct comprises at least one control
element having a sequence selected from the group consisting of SEQ ID
NOs: 21-24, all derived from SEQ ID NO: 11, and functional naturally
occurring and man-induced variants thereof.

According to still further features in the described preferred
embodiments the expression construct is selected from the group consisting
of plasmid, cosmid, phage, virus, bacmid and artificial chromosome.

According to still further features in the described preferred 
embodiments the expression construct is designed to integrate into a 
genome of a host. 

According to yet another aspect of the present invention there is
provided a transduced cell or transgenic plant transduced with the above
described expression construct.

According to still another aspect of the present invention there is
provided a method of isolating a gene encoding a polypeptide having an
amino acid sequence homologous to SEQ ID NOs: 17, 18 and 19 and
having a major lycopene cyclase catalytic activity from a species, the
method comprising the step of screening a complementary or genomic DNA
library prepared from isolated RNA or genomic DNA extracted from the
species with a probe having a sequence derived from SEQ ID NOs: 8, 9, 10
or 11 and isolating clones reacting with the probe.

The present invention successfully addresses the shortcomings of the
presently known configurations by providing novel polynucleotides
controlling the expression of genes in the fruits and flowers of plants and a
novel polynucleotide encoding lycopene cyclase.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is herein described, by way of example only, with
reference to the accompanying drawings, wherein:

FIG. 1 presents the pathway of carotenoid biosynthesis in plants and
algae. Enzymes are indicated by their gene assignment symbols: aba2,
zeaxanthin epoxidase; CrtL-b, lycopene β-cyclase; CrtL-e, lycopene
ε-cyclase; CrtR-b, β-ring hydroxylase; CrtR-e, ε-ring hydroxylase; Pds,
phytoene desaturase (crtP in cyanobacteria); Psy, phytoene synthase (crtB
in cyanobacteria); Zds, ζ-carotene desaturase (crtQ in cyanobacteria).
GGDP, geranylgeranyl diphosphate.

FIG. 2 shows fine genetic mapping and molecular organization of B
on chromosome 6 of the tomato linkage map. The linkage map was adapted
from Eshed and Zamir (1995). The relevant chromosomal segments from
L. pennellii that were introgressed to L. esculentum lines IL 6-2 and IL 6-3
are represented by black bars. A high-resolution genetic map around B is
displayed with genetic distances in map units (cM). Positions of the YAC
inserts are designated under the map.

FIG. 3 demonstrates levels of mRNA (relative units) during fruit
ripening of wild-type tomato L. esculentum. Data are derived from
quantifying the DNA products in the RT-PCR analysis of total RNA
extracted at different stages of fruit development. Ripening stages: IG,
immature green; MG, mature green; B, breaker; O, orange; P, pink; R, red.

FIG. 4 demonstrates levels of mRNA (relative units) during fruit 
ripening of the tomato mutant High-beta. Data are derived from 
quantifying the DNA products in the RT-PCR analysis of total RNA 



WO 00/08920 



PCT/US99/1L8327 




extracted at different stages of fruit development. Ripening stages: G, green;
MG, mature green; B, breaker; O, orange; P, pink; R, red.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is of novel polynucleotide sequences isolated
from tomato which can be used to control gene expression in plant
chromogenic tissues, especially fruit and flower. The present invention is
further of polynucleotide sequences isolated from tomato which encode a
lycopene cyclase which can be used to alter carotenogenesis in
carotenoid-producing organisms.

The principles and operation of the present invention may be better 
understood with reference to the drawings and accompanying descriptions. 

Before explaining at least one embodiment of the invention in detail,
it is to be understood that the invention is not limited in its application to the
details of construction and the arrangement of the components set forth in
the following description or illustrated in the drawings. The invention is
capable of other embodiments or of being practiced or carried out in various
ways. Also, it is to be understood that the phraseology and terminology
employed herein is for the purpose of description and should not be
regarded as limiting.

Fruits of the cultivated tomato (Lycopersicon esculentum) accumulate
lycopene, a red carotenoid pigment. A dominant allele of gene B
determines accumulation of β-carotene in the fruits of the tomato mutant
high-beta at the expense of lycopene, resulting in a unique orange color.
Conversion of lycopene to β-carotene in the biosynthesis pathway of
carotenoids is catalyzed by the enzyme lycopene β-cyclase. Previously it
was shown that CrtL-b, the gene for lycopene β-cyclase, does not map to
the locus B in the tomato genetic map. This ruled out the possibility that a
mutation in lycopene β-cyclase encoded by CrtL-b causes the phenotype in
high-beta.

The locus B was mapped to chromosome No. 6. The dominant allele
B was found in the tomato introgression line IL 6-2. The DNA of B was
identified and cloned by a map-based (positional) cloning method. The
nucleotide sequence of this gene was determined and revealed a novel
type of lycopene cyclase enzyme. Its primary structure has some
similarity to other lycopene cyclases and to the enzyme
capsanthin-capsorubin synthase from pepper. In addition, a nucleotide
sequence was identified which functions as a strong promoter during fruit
development in the B allele of the mutant High-beta.

Thus, according to one aspect of the present invention there is
provided an isolated complementary or genomic DNA segment comprising
a nucleotide sequence coding for a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NOs: 17, 18 and 19
and functional naturally occurring and man-induced variants thereof. The
polypeptide has a major lycopene cyclase catalytic activity. Polypeptides
which share at least 70 %, 75 %, 80 %, 85 %, 90 %, 95 % or more identical
amino acid residues with SEQ ID NOs: 17, 18 or 19 are also within the scope
of the present invention.
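
By way of illustration only, the identity thresholds recited above can be checked with a short calculation. The following Python sketch assumes that the two sequences have already been aligned by some external tool (gaps written as "-") and that identity is counted over the positions where neither sequence has a gap; other conventions for the denominator exist and give somewhat different values. The function names and the two toy peptides are hypothetical and are not taken from the sequence listing.

```python
# Minimal sketch: percent identity between two PRE-ALIGNED sequences.
# Assumption: identity = identical positions / positions without a gap.

THRESHOLDS = (70, 75, 80, 85, 90, 95)  # percent levels named in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of gap-free aligned positions that are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * identical / compared


def highest_threshold_met(identity: float):
    """Return the largest listed threshold that the identity reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical 10-residue peptides differing at one position.
    a = "MKTAYIAKQR"
    b = "MKTAYIAKQK"
    ident = percent_identity(a, b)
    print(ident, highest_threshold_met(ident))
```

In practice the alignment itself, and the choice of which positions to count, would be made with an established alignment program; the sketch only shows the final arithmetic.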

As used herein in the specification and in the claims section below,
the phrase "major lycopene cyclase catalytic activity" refers to catalytic
activity mainly directed at the conversion of lycopene to β-carotene by
catalyzing the formation of two β-rings, one at each end of the linear
carotene, such that if introduced into lycopene-accumulating E. coli cells,
such cells also accumulate β-carotene to at least a few percent, e.g., 5 %,
and preferably about 15 % or more, of total carotenoids therein by
symmetric formation of two β-ionone rings on the linear lycopene
molecules therein.
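
The percentage criterion in this definition can be expressed as a simple calculation. The Python sketch below is an illustration under stated assumptions: carotenoid amounts are supplied in any consistent unit, the 5 % default reflects the "e.g., 5 %" figure in the definition, and the function names and the numbers in the example are hypothetical and do not represent measured data.

```python
# Minimal sketch: share of beta-carotene among total carotenoids.
# The amounts used in the example are invented for illustration only.


def beta_carotene_percent(amounts: dict) -> float:
    """Return beta-carotene as a percentage of total carotenoids."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total carotenoid amount must be positive")
    return 100.0 * amounts.get("beta-carotene", 0.0) / total


def meets_activity_criterion(amounts: dict, minimum_percent: float = 5.0) -> bool:
    """True if beta-carotene reaches the chosen minimum share of the total."""
    return beta_carotene_percent(amounts) >= minimum_percent


if __name__ == "__main__":
    # Hypothetical extract from lycopene-accumulating cells expressing a cyclase.
    sample = {"lycopene": 85.0, "beta-carotene": 15.0}
    print(beta_carotene_percent(sample), meets_activity_criterion(sample))
```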

According to a preferred embodiment of the invention the nucleotide
sequence is as set forth in SEQ ID NOs: 8, 9, 10 or 11, or functional
naturally occurring or man-induced variants thereof. As further shown
below, these sequences are genomic and complementary DNA sequences
which were derived while reducing the present invention to practice from
certain tomato cultivars or lines. However, nucleotide sequences which
share 70 %, 75 %, 80 %, 85 %, 90 %, 95 % or more identical nucleotides with
SEQ ID NOs: 8, 9, 10 or 11 are also within the scope of the present invention.

According to another aspect of the present invention there is 
provided a polypeptide comprising an amino acid sequence selected from 
the group consisting of SEQ ID NOs: 17, 18 and 19 and functional naturally 
occurring and man-induced variants thereof, the polypeptide having a major 
lycopene cyclase catalytic activity. Homologous polypeptides as described
above and further detailed hereinunder are also envisaged.

According to another aspect of the present invention there is
provided a transduced cell overexpressing a polypeptide including an amino
acid sequence selected from the group consisting of SEQ ID NOs: 17, 18
and 19, and functional naturally occurring and man-induced variants
thereof, the polypeptide having a major lycopene cyclase catalytic activity,
the cell therefore overproducing β-carotene at the expense of lycopene.

The cell according to the present invention can be of any type. For
example, the cell can be a prokaryotic cell or a eukaryotic cell. Preferably
the cell is of a higher plant. The cell preferably forms a part of a transgenic
plant. Methods of transducing cells (and cells in organisms to form
transgenic organisms) are well known in the art and do not require further
description herein. Protocols are available, for example, in Sambrook et
al. (1989).

As used herein in the specification and in the claims section below,
the term "transduced" refers to the result of a process of inserting nucleic
acids into cells. The insertion may, for example, be effected by
transformation, viral infection, injection, transfection, gene bombardment,
electroporation or any other means effective in introducing nucleic acids
into cells. Following transduction the nucleic acid is either integrated, in
whole or in part, into the cell's genome (DNA), or remains external to the
cell's genome, thereby providing stably transduced or transiently
transduced cells.

According to yet another aspect of the present invention there is
provided a method of down-regulating production of β-carotene in a cell
comprising the step of introducing into the cell at least one anti-sense
polynucleotide sequence capable of base pairing with messenger RNA
coding for a polypeptide including an amino acid sequence selected from
the group consisting of SEQ ID NOs: 17, 18 and 19 and functional naturally
occurring and man-induced variants thereof, the polypeptide having a major
lycopene cyclase catalytic activity, the cell therefore underproducing
β-carotene from lycopene. Again, the cell can be of any type. For example,
the cell can be a prokaryotic cell or a eukaryotic cell. Preferably the cell is
of a higher plant. The cell preferably forms a part of a transgenic plant.

As used herein in the specification and in the claims section below,
the term "down regulating" means also reducing, lowering, inhibiting, etc.,
e.g., permanently or transiently reducing.

As used herein in the specification and in the claims section below, 
the term "production" means also formation or generation. 

As used herein in the specification and in the claims section below,
the term "introducing" means also providing with or inserting.

The at least one anti-sense polynucleotide sequence according to the
present invention can include one or several synthetic oligonucleotides
capable of base pairing with messenger RNA derived from the
above-identified nucleotide sequences. The synthetic oligonucleotide
preferably includes a man-made modification rendering the synthetic
oligonucleotide more stable in the cellular environment. The modified
oligonucleotide can be, for example, a methylphosphonate oligonucleotide,
monothiophosphate oligonucleotide, dithiophosphate oligonucleotide,
phosphoramidate oligonucleotide, phosphate ester oligonucleotide, bridged
phosphorothioate oligonucleotide, bridged phosphoramidate oligonucleotide,
bridged methylenephosphonate oligonucleotide, dephospho internucleotide
analogs with siloxane bridges, carbonate bridge oligonucleotide,
carboxymethyl ester bridge oligonucleotide, acetamide bridge
oligonucleotide, carbamate bridge oligonucleotide, thioether bridge
oligonucleotide, sulfoxy bridge oligonucleotide, sulfono bridge
oligonucleotide or an α-anomeric bridge oligonucleotide. For further
details the reader is referred to Cook (1991).
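
As an illustration of what "capable of base pairing with messenger RNA" implies at the sequence level, an anti-sense oligonucleotide is, in the simplest case, the reverse complement of a stretch of the target mRNA. The Python sketch below makes that assumption and ignores the backbone modifications listed above, which do not change the base sequence. The function name and the nine-nucleotide target fragment are hypothetical and are not taken from the sequence listing.

```python
# Minimal sketch: derive an anti-sense DNA oligonucleotide (written 5'->3')
# as the reverse complement of a target mRNA stretch (written 5'->3').

# Complement table for RNA or DNA input; output uses DNA letters.
_COMPLEMENT_TO_DNA = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_dna(target: str) -> str:
    """Return the reverse complement of `target` as a DNA sequence."""
    target = target.upper()
    unknown = set(target) - set(_COMPLEMENT_TO_DNA)
    if unknown:
        raise ValueError(f"unsupported symbols: {sorted(unknown)}")
    # Reverse the target, then complement each base.
    return "".join(_COMPLEMENT_TO_DNA[base] for base in reversed(target))


if __name__ == "__main__":
    # Hypothetical mRNA fragment used only to show the operation.
    print(antisense_dna("AUGGAAGCU"))
```

Choosing which region of a real transcript to target, and the oligonucleotide length, involves further considerations (accessibility, specificity) that this sketch does not address.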

Alternatively, the anti-sense polynucleotide sequence is encoded by
an anti-sense expression vector. Such vectors are well known in the art and
include, for example, pBI101, pBI121 and pBI221 (commercially available
from Clontech).

Further according to the present invention, there is provided an
expression construct for directing an expression of a gene in fruit or flower
of a plant. The expression vector according to the present invention
includes a regulatory sequence selected from the group consisting of an
upstream region of a B allele of tomato and an upstream region of a b allele
of tomato. Thus, according to a preferred embodiment of the invention, the
expression construct includes a functional part of nucleotides 1-1210 of
SEQ ID NO: 14 or nucleotides 1-1600 of SEQ ID NO: 15, or functional
naturally occurring and man-induced variants thereof.

According to a preferred embodiment, the expression construct
includes at least one control element having a sequence selected from the
group consisting of SEQ ID NOs: 21-24, all derived from SEQ ID NO: 11,
and functional naturally occurring and man-induced variants thereof.

As further detailed in the Examples section hereinbelow, these
sequence elements, which are 26, 13, 9 and 8 bp long and start at (5' end)
nucleotides 859, 753, 479 and 306, respectively, of SEQ ID NOs: 11, 15,
are located upstream of the initiator methionine codon in the B allele. They
are the main difference between the B and b alleles, and are therefore
responsible for the differential expression of the B locus in tomato.
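
The start positions and lengths given above locate each element within its parent sequence. The Python sketch below shows the coordinate arithmetic only, assuming that the stated positions are 1-based and count from the 5' end. The sequence used in the example is a synthetic placeholder, not SEQ ID NO: 11 or 15, and the function names are hypothetical.

```python
# Minimal sketch: locate sequence elements from (1-based start, length) pairs.
# Assumption: positions are 1-based and counted from the 5' end.

# (start, length) pairs as stated in the text, in the order given there.
ELEMENTS = [(859, 26), (753, 13), (479, 9), (306, 8)]


def element_interval(start: int, length: int) -> tuple:
    """Convert a 1-based start and length to a 0-based half-open interval."""
    if start < 1 or length < 1:
        raise ValueError("start and length must be positive")
    return (start - 1, start - 1 + length)


def extract_element(sequence: str, start: int, length: int) -> str:
    """Return the sub-sequence of `length` bases beginning at 1-based `start`."""
    begin, end = element_interval(start, length)
    if end > len(sequence):
        raise ValueError("element extends past the end of the sequence")
    return sequence[begin:end]


if __name__ == "__main__":
    placeholder = "ACGT" * 300  # synthetic 1200-nt stand-in, NOT a real SEQ ID
    for start, length in ELEMENTS:
        print(start, length, extract_element(placeholder, start, length))
```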

The expression construct according to the present invention can be a
plasmid, cosmid, phage, virus, bacmid or an artificial chromosome. Each of
these constructs has unique sequences rendering the construct most
applicable for some as opposed to other applications, as well known in the
art. Regardless of its type, according to a preferred embodiment of the
present invention the expression construct is designed to integrate into a
genome of a host, such that stable transfectants are obtainable. However,
the scope of the present invention is not limited to such constructs. In other
words, constructs designed for transient transfection are also within the
scope of the present invention. In any case, the construct preferably
includes at least one positive and/or negative selection gene, and is suitable
for transformation, transfection, transgenization and gene knock-in
procedures.

According to yet another aspect of the present invention there is
provided a transduced cell or a transgenic plant transduced with the above
described expression construct. Such a cell or plant expresses the gene
located downstream of the regulatory sequence in a controlled
developmental manner, mimicking the expression of the lycopene cyclase
gene of the B locus in b or B tomato plants.

According to still another aspect of the present invention there is
provided a method of isolating a gene encoding a polypeptide having an
amino acid sequence homologous to SEQ ID NOs: 17, 18 and 19 and
having a major lycopene cyclase catalytic activity from a species. The
method is effected by executing the following method steps, in which a
complementary or genomic DNA library prepared from isolated RNA or
genomic DNA extracted from the species is screened with a probe having a
sequence derived from SEQ ID NOs: 8, 9, 10 or 11 and clones reacting with
the probe are isolated. Such clones are good candidates to include segments
of genes homologous to SEQ ID NOs: 8, 9, 10 or 11, which genes are good
candidates to encode a polypeptide having an amino acid sequence
homologous to SEQ ID NOs: 17, 18 and 19. 5' cloning strategies, such as,
but not limited to, RACE protocols can be employed to isolate full-length
clones, as well known in the art.

Thus, according to the present invention, the following uses of gene
B of tomato are anticipated:

(i) Increasing the content of β-carotene in tissues of transgenic
plants over-expressing it. This is an advantageous attribute in fruits and
vegetables because it will provide better nutritional value and enhanced
color.

(ii) Increasing the accumulation of lycopene in fruits and flowers 
of transgenic plants by reducing the activity of B using anti-sense inhibition, 
preferably via anti-sense expression. 

(iii) Achieving strong expression of transgenes specifically in
fruits and flowers using the promoter sequence of the gene B from
High-beta tomato cultivars.

Each of the various aspects of the present invention as delineated
hereinabove and as claimed in the claims section below finds experimental
support in the Examples section that follows.

EXAMPLES 

Bacteria and plants: E. coli strain XL1-Blue was used in all
experiments described herein. Tomato (Lycopersicon esculentum) cv. M82
served as the 'wild-type' strain in the fruit ripening measurements. The
introgression lines IL 6-2 and IL 6-3 (Eshed and Zamir, 1994) were used as
a source for the B mutation and employed for fine mapping of the B locus.

Fine mapping and cloning of the B locus: As a source of the B
mutation, the lines IL 6-2 or IL 6-3 (BB) were used (Eshed and Zamir,
1995). Each line was crossed with the cultivated tomato cv. M82 (bb), and
the hybrids were selfed to create an F2 population that segregated for both
the B phenotype and the introgressed DNA segment. 1335 F2 plants were
scored for the RFLP using markers CT193 and TG578 (Pnueli et al., 1998;
Tanksley et al., 1992) and for the B phenotype, and recombinant plants were
collected. The 32 resulting recombinants were further screened with all the
available RFLP probes surrounding B to accurately map the mutated locus
(Figure 2). One RFLP marker, TM16 (Pnueli et al., 1998), co-segregated
with B at a resolution of less than 0.0375 cM.
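
The resolution figure quoted above is consistent with a simple calculation. The Python sketch below assumes that each F2 plant contributes two informative gametes and that the resolution limit corresponds to a single recombination event among all gametes scored; this reading is an interpretation of the numbers in the text, not a statement of how the figure was originally derived. The function name is hypothetical.

```python
# Minimal sketch: smallest genetic distance detectable in an F2 population,
# assuming two informative gametes per plant and one recombinant as the limit.


def map_resolution_cm(n_f2_plants: int, gametes_per_plant: int = 2) -> float:
    """Return the distance (in cM) that one recombinant gamete represents."""
    if n_f2_plants < 1:
        raise ValueError("population size must be at least 1")
    gametes = n_f2_plants * gametes_per_plant
    return 100.0 / gametes  # 1 cM = 1 % recombinant gametes


if __name__ == "__main__":
    # 1335 F2 plants, as stated in the text.
    print(round(map_resolution_cm(1335), 4))
```

With 1335 plants this gives 100/2670, which rounds to 0.0375 cM, so the absence of recombinants between TM16 and B places them within less than that distance under these assumptions.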

The tomato genomic library in YACs was screened with DNA of
markers TM16 and TG275. Two overlapping YAC clones, designated 271
and 310, were identified by hybridization. DNA sequences from the ends of
the inserts in these YACs were amplified by PCR as previously described
(Pnueli et al., 1998) and were used as molecular probes to screen the 32
recombinant plants for Restriction Fragment Length Polymorphism (RFLP).
The YAC ends were mapped as shown in Figure 2. It was established that
YAC 310 overlaps the B locus, thus ensuring that the 200 kb insert of YAC
310 contains the B gene. In contrast, recombination between the left end of
YAC 271 (271LE) and the B phenotype indicated that this YAC clone did
not carry the B locus and defined its location in a relatively small region of
YAC 310 that did not overlap with YAC 271 (Figure 2).

The DNA insert of YAC 310 was cut with EcoRI and the resulting
fragments were subcloned in the vector λgt11. Two phage clones,
designated B1 and B3, co-segregated with the B locus and mapped to the
end of YAC 310. The nucleotide sequence of the insert of B1 was
determined. The B1 fragment was further used to screen a genomic library
of wild-type tomato (cv. VF36) in the lambda vector EMBL3, and a cosmid
library of L. pennellii. A single positive phage clone and a single positive
cosmid clone were isolated, respectively.

The B1 fragment was also used to screen 1.5 million plaques of a
cDNA library from a tomato fruit and 3 identical clones were isolated. The
ca. 1300 bp inserts in these clones contained an open reading frame that was
lacking the 5' end, as determined by nucleotide sequence analysis. The
full-length cDNAs were then obtained using the reverse-transcription
polymerase chain reaction (RT-PCR) method with RNA isolated from
wild-type (VF-36) and high-beta (IL 6-3) flowers and fruits. For the PCR
reaction we used 5' primers based on the genomic sequence taken from the
sequence of the B1 insert and 3' primers based on the cloned cDNA. The full
coding regions of the cDNA of the allele b of wild-type tomato (cv. VF-36)
and the allele B from L. pennellii were excised in the pBluescript KS-
vector, and the resulting plasmids were designated pBESC and pBPENN,
respectively. DNA sequence comparison between cDNA and genomic
sequences revealed that no introns interrupt the genomic sequence of b
(or B).

DNA blot hybridization was done according to conventional
techniques (Sambrook et al., 1989; Eshed and Zamir, 1994) at low
stringency in a buffer containing 10 x Denhardt's solution, 5 x SSC, 50 mM
phosphate buffer (pH 7), 1 % SDS and 50 mg salmon sperm DNA (sheared,
autoclaved and boiled before adding to the mixture). Filters were washed
with 5 x SSC at 65 °C.

Genomic DNA of tomato was prepared from 5 grams of leaf as
previously described (Eshed and Zamir, 1995).

Amplification by the polymerase chain reaction (PCR) method of the
full length cDNA of the b allele was carried out with the following
oligonucleotide primers, whose sequence was derived from the genomic
sequence of the B1 clone (see below): Forward:
5'-AATGGAAGCTCTTCTCAAGCCT-3' (SEQ ID NO:1), Reverse:
5'-CACATTCAAAGGCTCTCTATCGC-3' (SEQ ID NO:2).

Total RNA was extracted from 1.5 grams of fruit or 0.1 gram of
flower or leaf tissues as previously described (Pecker et al., 1996).

Measurement of mRNA levels by reverse transcription followed
by polymerase chain reaction (RT-PCR) was carried out as
previously described (Pecker et al., 1996) using the following
oligonucleotides as primers for the PCR reaction. For amplification of the
gene Psy the following primers were employed: Forward1:
5'-TCGAGAACGGACGATG-3' (SEQ ID NO:3), Forward2 (internal):
5'-TGCAGAGAGACAGATG-3' (SEQ ID NO:4) and Reverse:
5'-ATTTCATGCTTTATCTTTGAAG-3' (SEQ ID NO:5).

For amplification of allele B: Forward 5'-GCTGAAGTTGAAATTGTTGA-3'
(SEQ ID NO:6) and Reverse 5'-TCTCTTCCTCAATAACACTT-3' (SEQ ID NO:7).
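
As a quick consistency check on the primer pair for allele B, the following sketch locates SEQ ID NO:6 and the reverse complement of SEQ ID NO:7 in two 50-nucleotide excerpts transcribed from the b cDNA in the sequence listing (SEQ ID NO:8, nucleotides 501-550 and 951-1000). The helper names and the use of excerpts instead of the full sequence are illustrative assumptions and are not part of the procedure described here.

```python
# Illustrative sketch only: helper names are hypothetical, and the excerpts
# are assumed to be numbered as in the sequence listing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def binding_position(template: str, primer: str, first_nt: int,
                     reverse: bool = False) -> int:
    """Return the 1-based sense-strand position where the primer site starts.

    first_nt is the coordinate of the first nucleotide of the excerpt.
    Returns -1 if no exact match is found.
    """
    site = reverse_complement(primer) if reverse else primer.upper()
    index = template.upper().find(site)
    return -1 if index < 0 else first_nt + index


# Excerpts transcribed from SEQ ID NO:8 (b cDNA).
SEQ8_501_550 = "GAAGCTGAAGTTGAAATTGTTGAATAGTTGTGTTGAGAACAGAGTGAAGT"
SEQ8_951_1000 = "ATTAAGGCATTTGGGGATCAAAGTGAAAAGTGTTATTGAGGAAGAGAAAT"

FORWARD = "GCTGAAGTTGAAATTGTTGA"  # SEQ ID NO:6
REVERSE = "TCTCTTCCTCAATAACACTT"  # SEQ ID NO:7

forward_start = binding_position(SEQ8_501_550, FORWARD, first_nt=501)
reverse_start = binding_position(SEQ8_951_1000, REVERSE, first_nt=951,
                                 reverse=True)
reverse_end = reverse_start + len(REVERSE) - 1
product_length = reverse_end - forward_start + 1
print(forward_start, reverse_start, reverse_end, product_length)
```

Under these assumptions the forward site starts at nucleotide 504 and the reverse site spans nucleotides 978-997 of SEQ ID NO:8, which would correspond to an amplified fragment of 494 bp from the cDNA.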

Sequence analysis: DNA sequence analysis was performed by the 
ABI Prism 377 DNA sequencer (Perkin Elmer) and processed with the ABI 
sequence analysis software. Nucleotide and amino acid sequence analysis 
and comparisons were done using the UWGCG software package. 

Plasmids: Plasmid pACCRT-EIB, for expressing bacterial
carotenoid biosynthesis genes in E. coli, was previously described
(Cunningham et al., 1993). Plasmids pBESC and pBPENN were constructed
by inserting a 1666 bp cDNA of the tomato b allele (from L. esculentum)
or B allele (from L. pennellii), respectively, in the EcoRV site of the
plasmid vector pBluescript KS (Stratagene®).

Pigment extraction and analysis: For extraction of pigments from
E. coli, aliquots of 2 ml were taken from bacterial suspension cultures. The
cells were harvested by centrifugation, washed once with water,
resuspended in 2 ml of acetone and incubated at 65 °C for 10 minutes in the
dark. The samples were centrifuged again at 13,000 g for 5 minutes and the
acetone supernatant containing the pigments was placed in a clean tube.
More than 99 % of the carotenoids were extracted by this procedure, as
determined by re-extraction after breaking and grinding the samples. The
pigment extract was blown to dryness under a stream of N2 and stored at
-20 °C until required for analysis.

Fruit pigments were extracted from 1.0 gram of fresh tissue. The
tissue was ground in 2 ml of acetone and incubated at room temperature in
the dark for 10 minutes. Then, 2 ml of dichloromethane were added and
the samples were agitated until all pigments were transferred to the
supernatant, which was then filtered. To each sample, 4 ml of ether and 0.4
ml of 12 % w/v NaCl/H2O were added and the mixture was shaken gently
until all pigment was transferred to the upper (ether) phase. The ether was
collected, and the pigment extract was blown to dryness under a stream of
N2 and stored at -20 °C until required for analysis.

Carotenoids were separated by reverse phase HPLC using a
Spherisorb ODS-2 column (silica 5 µm, 3.2 mm x 250 mm, Phenomenex®).
Samples of 50 µl of acetone-dissolved pigments were injected to a Waters
600 pump. The mobile phase consisted of acetonitrile:H2O (9:1) - solvent
A, and 100 % ethyl acetate - solvent B, which were used in a linear gradient
between A and B for 30 minutes, at a flow of 1 ml per minute. Light
absorption peaks were detected in the range of 200-600 nm using a Waters
996 photodiode-array detector. All spectra were recorded in the eluting
HPLC solvent, as was the fine absorbance spectral structure. Carotenoids
were identified by their characteristic absorption spectra and their typical
retention time, which corresponded to standard compounds of lycopene and
β-carotene. Peak areas were integrated by the Millennium chromatography
software (Waters).


EXPERIMENTAL RESULTS 

The only difference between the high-beta mutant and the wild-type
tomato is in the fruit color, due to accumulation of β-carotene at the expense
of lycopene. Thus, it was logical to assume that this mutation occurred in
the gene that encodes lycopene β-cyclase (CrtL-b). However, the CrtL-b
cDNA that was previously cloned from tomato (Pecker et al., 1996) was
mapped to 2 loci on chromosomes Nos. 4 and 10, but not on chromosome 6,
where the B locus was mapped. Even at very low stringency of
hybridization conditions we were unable to detect any hybridization of the
tomato CrtL-b like sequences on chromosome 6.

Therefore, the only way to clone the gene B, which is responsible for 
the high-beta phenotype, was to use map-based ("positional") cloning 
techniques. 

Fine mapping of the B locus: As a source of the B mutation, the
IL-6-2 or IL-6-3 (BB) (Eshed and Zamir, 1995) tomato lines were employed.
Each line was crossed with the cultivated tomato cv. M-82 (bb), and the
hybrids were selfed to create an F-2 population that segregated for both the
B phenotype and the introgressed DNA segment. A total of 1335 F-2 plants
were scored for RFLP using markers CT-193 and TG-578 (Pnueli et al.,
1998; Tanksley et al., 1992) and for the B phenotype, and recombinant
plants were collected. The 32 recombinants collected were further screened
with all the available RFLP probes surrounding B to accurately map the
mutated locus (Figure 2). One RFLP marker, TM-16 (Pnueli et al., 1998),
co-segregated with B at a resolution of less than 0.0375 cM.
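
The resolution quoted above is consistent with a single recombination event among the gametes sampled by 1335 F-2 plants. The sketch below makes that arithmetic explicit. It assumes two scorable gametes per F-2 plant and 1 cM per 1 % recombinant gametes; this is an interpretation of the figure, not a calculation stated in the text, and the function name is hypothetical.

```python
def map_resolution_cm(f2_plants: int, recombinants: int = 1) -> float:
    """Map distance (cM) represented by `recombinants` events in an F2 sample.

    Assumes each F2 plant contributes two scorable gametes and that
    1 cM corresponds to 1 % recombinant gametes.
    """
    if f2_plants <= 0:
        raise ValueError("f2_plants must be positive")
    gametes = 2 * f2_plants
    return 100.0 * recombinants / gametes


# 1335 F-2 plants were scored in the fine-mapping population.
print(round(map_resolution_cm(1335), 4))
```

With no recombinants observed between a marker and the locus, the true distance is expected to be smaller than the value returned for one recombinant, which is why the resolution is expressed as an upper bound.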

The tomato genomic library in YACs was screened with the DNA
marker TM-16 as a molecular probe. Two YAC clones, designated 271 and
310, were identified by hybridization. DNA sequences from the ends of the
inserts in these YACs were amplified by PCR as previously described
(Pnueli et al., 1998) and were used as molecular probes to screen the 32
recombinant plants for Restriction Fragment Length Polymorphism (RFLP).
The YAC ends were mapped as shown in Figure 2. It was established that
YAC 310 overlaps the B locus, which ensured that the 200 kb insert of YAC
310 contains the B gene. In contrast, recombination between YAC 271 and
the B phenotype indicated that this clone did not carry the B locus.
Moreover, it established that B resides in a small, confined region of
YAC 310 that does not overlap with YAC 271 (Figure 2).

The DNA insert of YAC 310 was cut with EcoRI and the resulting
fragments were subcloned in the vector λgt11. Two phage clones,
designated B1 and B3, co-segregated with the B locus and mapped to the
end of YAC 310. The nucleotide sequence of the insert of B1 was
determined. The B1 fragment was further used to screen a genomic library
of wild-type tomato (cv. VF36) in the lambda vector EMBL3, and a cosmid
library of L. pennellii. A single positive phage clone and a single positive
cosmid clone were isolated, respectively.

The B1 fragment was also used to screen 1.5 million plaques of a
cDNA library from tomato fruit, and 3 identical clones were isolated. The
ca. 1300 bp inserts in these clones contained an open reading frame that was
lacking the 5' end, as determined by nucleotide sequence analysis. The
full-length cDNAs were then obtained by the reverse-transcription polymerase
chain reaction (RT-PCR) method with RNA isolated from wild-type (VF-36)
and high-beta (IL 6-3) flowers and fruits. For the PCR reaction we used
5' primers based on the genomic sequence of the B1 insert and 3' primers
based on the cloned cDNA. The full coding regions of the cDNA of allele b
of wild-type tomato (cv. VF-36) and of allele B from L. pennellii were
inserted in the pBluescript KS- vector, and the resulting plasmids were
designated pBESC and pBPENN, respectively. DNA sequence comparison
between cDNA and genomic sequences revealed that no introns interrupt
the genomic sequence.

Table 1 below summarizes the sequence data with reference to the
sequence listing:



TABLE 1

Type                              Allele   Species                         SEQ ID NO:
cDNA                              b        L. esculentum                   8
gDNA                              b        L. esculentum                   9
cDNA                              B        L. pennellii                    10
gDNA                              B        L. pennellii                    11
cDNA                              ogC      L. esculentum                   12
translated cDNA                   b/B      L. esculentum / L. pennellii    13
translated gDNA                   b        L. esculentum                   14
translated gDNA                   B        L. pennellii                    15
translated cDNA                   ogC      L. pennellii                    16
peptide (translated from cDNA)    b        L. esculentum                   17
peptide (translated from gDNA)    b        L. esculentum                   18
peptide (translated from cDNA)    B        L. pennellii                    19
peptide (translated from cDNA)    ogC      L. esculentum                   20

cDNA = complementary DNA; gDNA = genomic DNA; bp = base pairs; aa
= amino acid.

Cloning and sequence analysis of old-gold-crimson (ogC)
mutation: Old-gold and crimson are two names given to a well-known
recessive mutation that was found in the Philippines in 1951 (Butler, 1962
and the SolGenes databases:
http://probe.nal.usda.gov:8300/cgi-bin/webace?db=solgenes&class=Locus&object=og
and
http://probe.nal.usda.gov:8300/cgi-bin/webace?db=solgenes&class=Image&object=og%2c+old+gold).
The ogC locus was mapped to chromosome 6. At least 2000 F-2 progenies
of a cross between High-beta (BB) and ogC were screened for B-ogC
double mutants and not a single recombinant plant was found. This places
B and ogC less than 0.025 cM apart. The ogC phenotype is characterized
by over-accumulation of lycopene, in both fruits and flowers, compared
with wild-type tomatoes, and by a lack of β-carotene in the fruits.

Cloning the B locus from ogC mutant plants was done by the PCR
method on total genomic DNA extracted from ogC plants, using primers that
were based on the sequence of the b allele described herein. Sequence
analysis of the b-homolog revealed a single base deletion in the coding
sequence of b at position 104 from the initiation codon (compare SEQ ID
NOs: 13 and 16). This deletion created a frame-shift mutation that
shortened the translatable polypeptide to 56 amino acids. This finding
indicates that ogC is a null mutation of the normal function of the b
gene.
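
The effect of the deletion can be illustrated with a short sketch that translates the first 200 nucleotides of the b coding sequence, as transcribed from SEQ ID NO:8 in the sequence listing, with and without the nucleotide at position 104. The standard genetic code is assumed, the helper names are hypothetical, and the transcription relies on the listing as extracted here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq: str) -> str:
    """Translate from the first nucleotide until a stop codon or the end."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(seq: str, position: int) -> str:
    """Remove the nucleotide at a 1-based position."""
    return seq[:position - 1] + seq[position:]


# Nucleotides 1-200 of SEQ ID NO:8 (b cDNA), in blocks of ten as in the listing.
B_CDNA_5PRIME = (
    "ATGGAAGCTC" "TTCTCAAGCC" "TTTTCCATCT" "CTTTTACTTT" "CCTCTCCTAC"
    "ACCCCATAGG" "TCTATTTTCC" "AACAAAATCC" "CTCTTTTCTA" "AGTCCCACCA"
    "CCAAAAAAAA" "ATCAAGAAAA" "TGTCTTCTTA" "GAAACAAAAG" "TAGTAAACTT"
    "TTTTGTAGCT" "TTCTTGATTT" "AGCACCCACA" "TCAAAGCCAG" "AGTCTTTAGA"
)

wild_type = translate(B_CDNA_5PRIME)
mutant = translate(delete_base(B_CDNA_5PRIME, 104))
print(len(wild_type), len(mutant))
```

In this sketch the wild-type excerpt translates without interruption over all 66 complete codons, whereas the reading frame of the deletion variant reaches a stop codon after 56 amino acids, in agreement with the truncated polypeptide described above.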

Sequence comparison of alleles in the B locus: Nucleotide
sequence analysis of the 1666 bp cDNA revealed an open reading frame of
498 codons, potentially coding for a polypeptide of 498 amino acids with a
calculated molecular mass of 56.4 kDa. Nucleotide sequence analysis
showed 98 % identity between b (from VF-36, SEQ ID NO: 8) and B (from
L. pennellii, SEQ ID NO: 10). The amino acid sequences of B and b are
97.4 % identical (SEQ ID NOs: 17 and 19).
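
For orientation, percent identity between two aligned sequences is the fraction of positions carrying the same residue. The sketch below applies an ungapped comparison to the first 100 nucleotides of SEQ ID NO:8 and SEQ ID NO:10 as transcribed from the sequence listing. It is an illustration only: the values reported above were obtained with the UWGCG package over the full sequences, and the function name is hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Nucleotides 1-100 of SEQ ID NO:8 (b, L. esculentum cv. VF-36).
B_ESC_1_100 = (
    "ATGGAAGCTC" "TTCTCAAGCC" "TTTTCCATCT" "CTTTTACTTT" "CCTCTCCTAC"
    "ACCCCATAGG" "TCTATTTTCC" "AACAAAATCC" "CTCTTTTCTA" "AGTCCCACCA"
)
# Nucleotides 1-100 of SEQ ID NO:10 (B, L. pennellii).
B_PENN_1_100 = (
    "ATGGAAGCTC" "TTCTCAAGCC" "TTTTCCATCT" "CTTTTACTTT" "CCTCTCCTAC"
    "ACCCTATAGG" "TCTATTGTCC" "AACAAAATCC" "TTCTTTTCTA" "AGTCCCACCA"
)

print(percent_identity(B_ESC_1_100, B_PENN_1_100))
```

Only the first 100 nucleotides are compared because an ungapped comparison is meaningful only while the two sequences remain in register; a full-length comparison would require a gapped alignment such as the one produced by the software named in the methods.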

In the 1200 bp sequence upstream of the translated region of B from
L. pennellii there are four sequence insertions as compared with the
equivalent region in b from VF-36. The inserts are 26, 13, 9, and 8 bp long
and start at (5' end) nucleotides 859, 753, 479 and 306, respectively, of SEQ
ID NOs: 11, 15. They are located upstream of the initiator methionine
codon in the B allele, are the main difference between the B and b alleles,
and are therefore responsible for the differential expression of the B locus in
tomato. Their sequences are TGACTTCACCCTTCTTTCTTGTCTTC
(SEQ ID NO:21), AGAGTCTGGGTTC (SEQ ID NO:22), CTAGTATCG
(SEQ ID NO:23) and CTAAATAT (SEQ ID NO:24). An additional
AATTTTCAAA (SEQ ID NO:25) sequence, which is found in upstream
regions of ethylene-activated genes such as E4 and E8 (Montgomery et al.,
1993), is shared by the upstream regions of the B and b alleles. All other
sequences in the promoter region are 90-94 % conserved in the two
alleles (compare SEQ ID NOs: 9 and 11).

The polypeptide products of B and b are β-carotene synthases:
The use of an E. coli heterologous system for carotenoid biosynthesis has
proven to be a powerful tool for identifying genes associated with
carotenoid biosynthesis. E. coli cells of the strain XL1-Blue carrying the
plasmid pACCRT-EIB accumulate lycopene (Cunningham et al., 1993).
Lycopene-accumulating E. coli cells were co-transformed with the plasmid
pBESC or pBPENN and selected on LB medium containing both ampicillin
and chloramphenicol. Carotenoids from cells carrying pACCRT-EIB alone,
or pACCRT-EIB and either pBESC or pBPENN, were extracted and
analyzed by HPLC.

Cells carrying only the pACCRT-EIB plasmid produced lycopene,
while cells carrying both pACCRT-EIB and pBPENN also accumulated
β-carotene, up to 13 % of total carotenoids. Similarly, cells carrying both
pACCRT-EIB and pBESC produced β-carotene, up to 5 % of total
carotenoids (see Table 2 below). These results indicated that the cDNA
products of both the B and b alleles are capable of converting lycopene to
β-carotene by the symmetric formation of two β-ionone rings on the linear
lycopene molecule.

TABLE 2

The B gene product converts lycopene to β-carotene. Accumulation of
carotenoids in E. coli cells expressing alleles B or b from tomato
(percent of total carotenoids)

plasmid                      lycopene    β-carotene
pACCRT-EIB                   100
pACCRT-EIB + pBESC (b)       87          13
pACCRT-EIB + pBPENN (B)      95          5


Sequence comparison between B and other carotene cyclases: The
nucleotide sequences of the coding region of b and the coding region of the
cDNA of the previously published lycopene β-cyclase from tomato, CrtL-b
(Pecker et al., 1996), are 59 % identical. The polypeptide products of these
genes are only 52 % identical. These data explain why CrtL-b could not
hybridize with the sequence of B. Moreover, while the similarity in amino
acid sequence between B and CRTL-B suggests a common mechanism of
lycopene cyclization, it clearly demonstrates that B is a novel lycopene
β-cyclase enzyme. There is no significant similarity (less than 45 %
identity) in the non-translated regions of these two genes.

Surprisingly, the nucleotide sequence of the cDNA of b is 83 %
identical with the cDNA of a gene from bell pepper (Capsicum annuum),
whose product catalyzes the conversion of the ubiquitous 5,6-epoxycarotenoids,
antheraxanthin and violaxanthin, into the ketocarotenoids capsanthin and
capsorubin, respectively (Bouvier et al., 1994). This enzyme, also called
capsanthin-capsorubin synthase (CCS), is synthesized specifically in pepper
fruits. There is 85 % identity in the deduced amino acid sequences of B and
CCS.

Expression of B gene during fruit ripening in wild-type and
High-beta: Previously, it has been shown that the steady-state levels of mRNA
of the genes for early enzymes in the carotenoid biosynthesis pathway,
phytoene synthase and phytoene desaturase, increase during fruit ripening in
tomato (Hirschberg et al., 1997). In the case of Pds it was demonstrated
that transcriptional up-regulation is responsible for this increase (reviewed
in Hirschberg et al., 1997). Recently, we have determined that the mRNA
level of CrtL-b, which encodes lycopene β-cyclase, decreases during tomato
fruit ripening (Pecker et al., 1996).

To determine the regulation of expression of the B gene during fruit
development in tomato, we measured its mRNA level by RT-PCR at
different stages of fruit development. As can be seen in Figure 3, mRNA of
the b gene is undetectable in leaves and during the green stages of fruit
ripening of wild-type tomato. However, it increases at the 'breaker' stage
of ripening but then disappears at later stages of ripening. This marked drop
in mRNA of B contrasts with the dramatic increase in the mRNA level of Psy
at the same stages of fruit ripening.

In contrast to the wild-type tomato, the mRNA level of B in the fruit
of the High-beta mutant (containing the B allele) increases dramatically at
the 'breaker' stage and remains high during all the subsequent ripening
stages (Figure 4). These results indicate that the major difference between
alleles b and B is in the level of expression at different ripening stages. The
results further explain the phenotype of mutant High-beta, carrying the B
allele, where a novel type of lycopene cyclase, which is capable of
converting lycopene to β-carotene, is highly expressed during fruit ripening.

Although the invention has been described in conjunction with
specific embodiments thereof, it is evident that many alternatives,
modifications and variations will be apparent to those skilled in the art.
Accordingly, it is intended to embrace all such alternatives, modifications
and variations that fall within the spirit and broad scope of the appended
claims.

WHAT IS CLAIMED IS:

1. An isolated complementary or genomic DNA segment 
comprising a nucleotide sequence coding for a polypeptide having an amino 
acid sequence selected from the group consisting of SEQ ID NOs: 17, 18 
and 19 and functional naturally occurring and man-induced variants thereof, 
with the provision that said polypeptide has a major lycopene cyclase 
catalytic activity. 

2. The isolated DNA segment of claim 1, wherein said
nucleotide sequence is selected from the group consisting of SEQ ID NOs:
8, 9, 10 and 11 and functional naturally occurring and man-induced variants
thereof.

3. The isolated DNA segment of claim 1, wherein said
nucleotide sequence is a cDNA or a genomic DNA isolated from tomato.

4. An isolated complementary or genomic DNA segment 
comprising a nucleotide sequence selected from the group consisting of 
SEQ ID NOs: 8, 9, 10 and 11. 

5. A polypeptide comprising an amino acid sequence selected 
from the group consisting of SEQ ID NOs: 17, 18 and 19 and functional 
naturally occurring and man-induced variants thereof, said polypeptide 
having a major lycopene cyclase catalytic activity. 

6. A transduced cell overexpressing a polypeptide including an
amino acid sequence selected from the group consisting of SEQ ID NOs:
17, 18 and 19 and functional naturally occurring and man-induced variants
thereof, said polypeptide having a major lycopene cyclase catalytic activity,
the cell therefore overproducing β-carotene at the expense of lycopene.

7. The transduced cell of claim 6, selected from the group 
consisting of a prokaryotic cell and a eukaryotic cell. 

8. The transduced cell of claim 7, wherein said eukaryotic cell is 
of a higher plant.

9. The transduced cell of claim 6, wherein the cell forms a part
of a transgenic plant.

10. A method of down-regulating production of β-carotene in a
cell comprising the step of introducing into the cell at least one anti-sense
polynucleotide sequence capable of base pairing with messenger RNA
coding for a polypeptide including an amino acid sequence selected from
the group consisting of SEQ ID NOs: 17, 18 and 19 and functional naturally
occurring and man-induced variants thereof, said polypeptide having a
major lycopene cyclase catalytic activity, the cell therefore underproducing
β-carotene from lycopene.

11. The method of claim 10, wherein said at least one anti-sense 
polynucleotide sequence includes a synthetic oligonucleotide. 

12. The method of claim 11, wherein said synthetic 
oligonucleotide includes a man-made modification rendering said synthetic 
oligonucleotide more stable in cell environment. 

13. The method of claim 11, wherein said synthetic 
oligonucleotide is selected from the group consisting of methylphosphonate 
oligonucleotide, monothiophosphate oligonucleotide, dithiophosphate 
oligonucleotide, phosphoramidate oligonucleotide, phosphate ester 
oligonucleotide, bridged phosphorothioate oligonucleotide, bridged 
phosphoramidate oligonucleotide, bridged methylenephosphonate 
oligonucleotide, dephospho internucleotide analogs with siloxane bridges, 
carbonate bridge oligonucleotide, carboxymethyl ester bridge 
oligonucleotide, carbonate bridge oligonucleotide, carboxymethyl ester 
bridge oligonucleotide, acetamide bridge oligonucleotide, carbamate bridge 
oligonucleotide, thioether bridge oligonucleotide, sulfoxy bridge 
oligonucleotide, sulfono bridge oligonucleotide and α-anomeric bridge
oligonucleotide. 

14. The method of claim 10, wherein said at least one anti-sense 
polynucleotide sequence is encoded by an expression vector. 

15. The method of claim 10, wherein said cell is selected from the 
group consisting of a prokaryotic cell and a eukaryotic cell.

16. The method of claim 15, wherein said eukaryotic cell is of a
higher plant.

17. The method of claim 15, wherein the cell forms a part of a 
transgenic plant. 

18. An expression construct for directing an expression of a gene 
in fruit or flower comprising a regulatory sequence selected from the group 
consisting of an upstream region of a B allele of tomato and an upstream 
region of a b allele of tomato. 

19. The expression construct of claim 18, comprising a functional
part of nucleotides 1-1210 of SEQ ID NO: 14 or nucleotides 1-1600 of SEQ
ID NO: 15, or functional naturally occurring and man-induced variants
thereof.

20. The expression construct of claim 18, comprising at least one
control element having a sequence selected from the group consisting of
SEQ ID NOs:21-24, all derived from SEQ ID NO:11, and functional
naturally occurring and man-induced variants thereof.

21. The expression construct of claim 18, wherein the expression 
construct is selected from the group consisting of plasmid, cosmid, phage, 
virus, bacmid and artificial chromosome. 

22. The expression construct of claim 18, designed to integrate 
into a genome of a host. 

23. A method of isolating a gene encoding a polypeptide having 
an amino acid sequence homologous to SEQ ID NOs: 17, 18 and 19 and 
having a major lycopene cyclase catalytic activity from a species, the 
method comprising the step of screening a complementary or genomic DNA 
library prepared from isolated RNA or genomic DNA extracted from said 
species with a probe having a sequence derived from SEQ ID NOs: 8, 9, 10
or 11 and isolating clones reacting with said probe.

24. A transduced cell transduced with the expression construct of
claim 18.

25. A transgenic plant transduced with the expression construct of
claim 18.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Joseph Hirschberg et al.

(ii) TITLE OF INVENTION: POLYNUCLEOTIDES CONTROLLING THE EXPRESSION
OF AND CODING FOR GENE B IN TOMATO AND USE
OF SAME FOR ALTERING CAROTENOID
BIOSYNTHESIS

(iii) NUMBER OF SEQUENCES: 25

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Mark M. Friedman c/o Anthony Castorina
(B) STREET: 20001 Jefferson Davis Highway, Suite 207
(C) CITY: Arlington
(D) STATE: Virginia
(E) COUNTRY: United States of America
(F) ZIP: 22202

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: 1.44 megabyte, 3.5" microdisk
(B) COMPUTER: Twinhead, Slimnote 890TX
(C) OPERATING SYSTEM: MS DOS version 6.2, Windows version 3.11
(D) SOFTWARE: Word for Windows version 2.0, converted to ASCII

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Friedman, Mark M.
(B) REGISTRATION NUMBER: 33,883
(C) REFERENCE/DOCKET NUMBER: 325/12

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 972-3-5625553
(B) TELEFAX: 972-3-5625554
(C) TELEX:

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AATGGAAGCT CTTCTCAAGC CT 22



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CACATTCAAA GGCTCTCTAT CGC 23

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
TCGAGAACGG ACGATG 16 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 4: 

TGCAGAGAGA CAGATG 16



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATTTCATGCT TTATCTTTGA AG 22 

(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

GCTGAAGTTG AAATTGTTGA 20 



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TCTCTTCCTC AATAACACTT 20



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1666 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

ATGGAAGCTC TTCTCAAGCC TTTTCCATCT CTTTTACTTT CCTCTCCTAC 50 

ACCCCATAGG TCTATTTTCC AACAAAATCC CTCTTTTCTA AGTCCCACCA 100 

CCAAAAAAAA ATCAAGAAAA TGTCTTCTTA GAAACAAAAG TAGTAAACTT 150 

TTTTGTAGCT TTCTTGATTT AGCACCCACA TCAAAGCCAG AGTCTTTAGA 200 

TGTTAACATC TCATGGGTTG ATCCTAATTC GAATCGGGCT CAATTCGACG 250 

TGATCATTAT CGGAGCTGGC CCTGCTGGGC TCAGGCTAGC TGAACAAGTT 300
TCTAAATATG GTATTAAGGT ATGTTGTGTT GACCCTTCAC CACTCTCCAT 350
GTGGCCAAAT AATTATGGTG TTTGGGTTGA TGAGTTTGAG AATTTAGGAC 400
TGGAAAATTG TTTAGATCAT AAATGGCCTA TGACTTGTGT GCATATAAAT 450
GATAACAAAA CTAAGTATTT GGGAAGACCA TATGGTAGAG TTAGTAGAAA 500 
GAAGCTGAAG TTGAAATTGT TGAATAGTTG TGTTGAGAAC AGAGTGAAGT 550 
TTTATAAAGC TAAGGTTTGG AAAGTGGAAC ATGAAGAATT TGAGTCTTCA 600 
ATTGTTTGTG ATGATGGTAA GAAGATAAGA GGTAGTTTGG TTGTGGATGC 650 
AAGTGGTTTT GCTAGTGATT TTATAGAGTA TGACAGGCCA AGAAACCATG 700 
GTTATCAAAT TGCTCATGGG GTTTTAGTAG AAGTTGATAA TCATCCATTT 750
GATTTGGATA AAATGGTGCT TATGGATTGG AGGGATTCTC ATTTGGGTAA 800 
TGAGCCATAT TTAAGGGTGA ATAATGCTAA AGAACCAACA TTCTTGTATG 850
CAATGCCATT TGATAGAGAT TTGGTTTTCT TGGAAGAGAC TTCTTTGGTG 900
AGTCGTCCTG TTTTATCGTA TATGGAAGTA AAAAGAAGGA TGGTGGCAAG 950 
ATTAAGGCAT TTGGGGATCA AAGTGAAAAG TGTTATTGAG GAAGAGAAAT 1000
GTGTGATCCC TATGGGAGGA CCACTTCCGC GGATTCCTCA AAATGTTATG 1050 
GCTATTGGTG GGAATTCAGG GATAGTTCAT CCATCAACAG GGTACATGGT 1100 
GGCTAGGAGC ATGGCTTTAG CACCAGTACT AGCTGAAGCC ATCGTCGAGG 1150 
GGCTTGGCTC AACAAGAATG ATAAGAGGGT CTCAACTTTA CCATAGAGTT 1200
TGGAATGGTT TGTGGCCTTT GGATAGAAGA TGTGTTAGAG AATGTTATTC 1250
ATTTGGGATG GAGACATTGT TGAAGCTTGA TTTGAAAGGG ACTAGGAGAT 1300 
TGTTTGACGC TTTCTTTGAT CTTGATCCTA AATACTGGCA AGGGTTCCTT 1350 
TCTTCAAGAT TGTCTGTCAA AGAACTTGGT TTACTCAGCT TGTGTCTTTT 1400
CGGACATGGC TCAAACATGA CTAGGTTGGA TATTGTTACA AAATGTCCTC 1450
TTCCTTTGGT TAGACTGATT GGCAATCTAG CAATAGAGAG CCTTTGAATG 1500
TGAAAAGTTT GAATCATTTT CTTCATTTTA ATTTCTTTGA TTATTTTCAT 1550 
ATTTTCTCAA TTGCAAAAGT GAGATAAGAG CTACATACTG TCAACAAATA 1600
AACTACTATT GGAAAGTTAA AATATGTGTT TGTTGTATGT TATTCTAATG 1650 
GAATGGATTT TGTAAA 1666

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2876 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GAATTCTCTG AAAAGGAGCA CCATATTTGC CGCACTGTGG TTCATATTTC 50

CAAGTACATT TAGATGAACT ATATCATCAG ATTGAAAGGT TATTGTATAA 100 

TCAATCCAGT GGATTCTCGT TCTGGCACCT TTAGAAGTAC ATGTGCGGAA 150 

AAGAATGATA AGGTTTGTAT TGTTGTTGAC AAAGCCTGTT GCCTTTCTCA 200 

TTTGTAAATG TTCTGAACGA CTCCTAAATT ACTCTTAAGG TGTAAGGTCT 250 

TCCGTGCCTG TTTGTAAATA TAATGCTGTG CCGTGACTTA CCTTTTGTAC 300 

CATTTGTTCA AATGTATGGC CTGAACACCA GGGTTGTCAA AAATGTCTCA 350 

TGCCCGTTTT ATTGGTCTGA AAATGGCGTG ATGCCAAATT CTGCCGCTCC 400

ACAGTGAGCA TTTCGATCTA CTGGAAATTG ACCAACTTAT TTTATCACTT 450 

GATAACTAAA CAAAATCCTA TTAACTTTAA TCATACATTG TATTTATACC 500 

GAAAAATTTA TGCATAACTC ATTAAATTAC CTTTTTTAGC AGTCAAATTC 550 

TAAATCAGTT TCTAATTTAT CAAAATGGCT TTTATAGGGT CCCATTTCCA 600 

CTAATATACC TGCCGTCCAT GCACTGACTA CAAAACAAAT ACCTCACTAT 650

GTTTGTTAGT GCTTGGTAAT ATAAAACCTT TTCTTTTATG AGAAAGTTCA 700 

CCGAGAATAA TTTTCTATTT GTGGCATAAT AGTATATAGT GCAGATTGAC 750

AAGAATTTAA TTTTGCAGTT GGGCACATGA ACAATTTTCC TCAAAGTTGT 800

AGAAAGTACT TTTCATTTTC TTGTCACCGA AAATTATTTA TAATTGAAAT 850 

TAAAACCGAA TGAGCTGCAA GATTCAAGTC GAATTTTCAA AAGAATTGAC 900 

CAAGAAAAAA TTCAAAAATA TCCCCCACCC CCTACCAAAC ACATCCTAAA 950 

GTGAGGTATA GACTGGGACT GGGATTGGGA AAAGGGTAAA ATGCTTTCAC 1000

TAGCTTAGCA AAGATTCCAC TTTGTTAGCT ATCTTTCTTT CTCATTTCCT 1050 

TTTTTCTTTT TCTTTTTTTT GTTATATAAG CCAAAGTAGG TACCCAAAAG 1100

CATCAATATT TTGTATTGCT TGGTGATTCC TCTGTAGTCC AGTATTTCAT 1150

TTTCTACAAG TTCCACCTCC CTCCATAATT AACCATTATC AATCTTATAC 1200 

ATTCTCTATA ATGGAAACTC TTCTCAAGCC TTTTCCATCT CTTTTACTTT 1250 

CCTCTCCTAC ACCCCATAGG TCTATTTTCC AACAAAATCC CTCTTTTCTA 1300 

AGTCCCACCA CCAAAAAAAA ATCAAGAAAA TGTCTTCTTA GAAACAAAAG 1350 

TAGTAAACTT TTTTGTAGCT TTCTTGATTT AGCACCCACA TCAAAGCCAG 1400

AGTCTTTAGA TGTTAACATC TCATGGGTTG ATCCTAATTC GAATCGGGCT 1450

CAATTCGACG TGATCATTAT CGGAGCTGGC CCTGCTGGGC TCAGGCTAGC 1500

TGAACAAGTT TCTAAATATG GTATTAAGGT ATGTTGTGTT GACCCTTCAC 1550 

CACTCTCCAT GTGGCCAAAT AATTATGGTG TTTGGGTTGA TGAGTTTGAG 1600 

AATTTAGGAC TGGAAAATTG TTTAGATCAT AAATGGCCTA TGACTTGTGT 1650 

GCATATAAAT GATAACAAAA CTAAGTATTT GGGAAGACCA TATGGTAGAG 1700 

TTAGTAGAAA GAAGCTGAAG TTGAAATTGT TGAATAGTTG TGTTGAGAAC 1750 

AGAGTGAAGT TTTATAAAGC TAAGGTTTGG AAAGTGGAAC ATGAAGAATT 1800
TGAGTCTTCA ATTGTTTGTG ATGATGGTAA GAAGATAAGA GGTAGTTTGG 1850
TTGTGGATGC AAGTGGTTTT GCTAGTGATT TTATAGAGTA TGACAGGCCA 1900 
AGAAACCATG GTTATCAAAT TGCTCATGGG GTTTTAGTAG AAGTTGATAA 1950 
TCATCCATTT GATTTGGATA AAATGGTGCT TATGGATTGG AGGGATTCTC 2000 
ATTTGGGTAA TGAGCCATAT TTAAGGGTGA ATAATGCTAA AGAACCAACA 2050 
TTCTTGTATG CAATGCCATT TGATAGAGAT TTGGTTTTCT TGGAAGAGAC 2100 
TTCTTTGGTG AGTCGTCCTG TTTTATCGTA TATGGAAGTA AAAAGAAGGA 2150 
TGGTGGCAAG ATTAAGGCAT TTGGGGATCA AAGTGAAAAG TGTTATTGAG 2200 
GAAGAGAAAT GTGTGATCCC TATGGGAGGA CCACTTCCGC GGATTCCTCA 2250 
AAATGTTATG GCTATTGGTG GGAATTCAGG GATAGTTCAT CCATCAACAG 2300 
GGTACATGGT GGCTAGGAGC ATGGCTTTAG CACCAGTACT AGCTGAAGCC 2350 
ATCGTCGAGG GGCTTGGCTC AACAAGAATG ATAAGAGGGT CTCAACTTTA 2400 
CCATAGAGTT TGGAATGGTT TGTGGCCTTT GGATAGAAGA TGTGTTAGAG 2450 
AATGTTATTC ATTTGGGATG GAGACATTGT TGAAGCTTGA TTTGAAAGGG 2500 
ACTAGGAGAT TGTTTGACGC TTTCTTTGAT CTTGATCCTA AATACTGGCA 2550 
AGGGTTCCTT TCTTCAAGAT TGTCTGTCAA AGAACTTGGT TTACTCAGCT 2600 
TGTGTCTTTT CGGACATGGC TCAAACATGA CTAGGTTGGA TATTGTTACA 2650 
AAATGTCCTC TTCCTTTGGT TAGACTGATT GGCAATCTAG CAATAGAGAG 2700 
CCTTTGAATG TGAAAAGTTT GAATCATTTT CTTCATTTTA ATTTCTTTGA 2750 
TTATTTTCAT ATTTTCTCAA TTGCAAAAGT GAGATAAGAG CTACATACTG 2800 
TCAACAAATA AACTACTATT GGAAAGTTAA AATATGTGTT TGTTGTATGT 2850 
TATTCTAATG GAATGGATTT TGTAAA 2876 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1740 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

ATGGAAGCTC TTCTCAAGCC TTTTCCATCT CTTTTACTTT CCTCTCCTAC 50 
ACCCTATAGG TCTATTGTCC AACAAAATCC TTCTTTTCTA AGTCCCACCA 100 
CCAAAAAAAA TCAAGAAAAT GTCTTCTTAG AAACAAAAGT AGTAAACTTT 150 
TTTGTAGCTT TCTTGATTTA GCACCCACAT CAAAGCCAGA GTCTTTAAAT 200 
GTTAACATCT CATGGGTTGA TCCTAATTCG AATCGGGCTC AATTCGACGT 250 
GATCATTATC GGAGCTGGCC CTGCTGGGCT CAGGCTAGCT GAACAAGTTT 300 
CTAAATATGG TATTAAGGTA TGTTGTGTTG ACCCTTCACC ACTCTCCATG 350 
TGGCCAAATA ATTATGGTGT TTGGGTTGAT GAGTTTGAGA ATTTAGGACT 400 
GGAAAATTGT TTAGATCATA AATGGCCTAT GACTTGTGTG CATATAAATG 450 
ATAACAAAAC TAAGTATTTG GGAAGACCAT ATGGTAGAGT TAGTAGAAAG 500 
AAGCTGAAGT TGAAATTGTT GAATAGTTGT GTTGAGAACA GAGTGAAGTT 550 
TTATAAAGCT AAGGTTTGGA AAGTGGAACA TGAAGAATTT GAGTCTTCAA 600 
TTGTTTGTGA TGATGGTAAG AAGATAAGAG GTAGTTTGGT TGTGGATGCA 650 
AGTGGTTTTG CTAGTGATTT TATAGAGTAT GACAGGCCAA GAAACCATGG 700 
TTATCAAATT GCTCATGGGG TTTTAGTAGA AGTTGATAAT CATCCATTTG 750 
ATTTGGATAA AATGGTGCTT ATGGATTGGA GGGATTCTCA TTTGGGTAAT 800 
GAGCCATATT TAAGGGTGAA TAATGCTAAA GAACCAACAT TCTTGTATGC 850 
AATGCCATTT GATAGAGATT TGGTTTTCTT GGAAGAGACT TCTTTGGTGA 900 
GTCGTCCTGT GTTATCGTAT ATGGAAGTAA AAAGAAGGAT GGTGGCAAGA 950 
TTAAGGCATT TGGGGATCAA AGTGAAAAGT GTTATTGAGG AAGAGAAATG 1000 
TGTGATCCCT ATGGGAGGAC CACTTCCGCG GATTCCTCAA AATGTTATGG 1050 
CTATTGGTGG GAATTCAGGG ATAGTTCATC CATCAACAGG GTACATGGTG 1100 
GCTAGGAGCA TGGCTTTAGC ACCAGTACTA GCTGAAGCCA TCGTCGAGGG 1150 
GCTTGGCTCA ACAAGAATGA TAAGAGGGTC TCAACTTTAC CATAGAGTTT 1200 
GGAATGGTTT GTGGCCTTTG GATAGAAGAT GTGTTAGAGA ATGTTATTCA 1250 
TTTGGGATGG AGACATTGTT GAAGCTTGAT TTGAAAGGGA CTAGGAGATT 1300 
GTTTGACGCT TTCTTTGATC TTGATCCTAA ATACTGGCAA GGGTTCCTTT 1350 
CTTCAAGATT GTCTGTCAAA GAAACTTGGT TTACTCAGCT TGTGTCTTTT 1400 
CGGACATGGC TCAAACATGA CTAGGTTGGG ATATTGTTAC AAAATGTCCT 1450 
CTTCCTTTGG TTAGACTGAT TGGCAATCTA GCAATAGAGA GCCTTTGAAA 1500 
TGTGAAAAGT TTGAATCATT TTCTTCATTT TAATTTCTTT GATTATTTTC 1550 
ATATTTTCTC AATTGCAGAA TGAGATAAAA ACTACATACT GTCGACAAAT 1600 
AAACTACTAT TGGAANGTTA AAATAATGTG TGTGTTGNAT GTTANGCCTA 1650 
ATGGAANGGA TGNGGTTANG CAATTTATGA ACTGNNCGCT CTGTTCGCTT 1700 
AAAANCCTTG GTTCCACCTT AANGGAANGG NCCGGCCATT 1740 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2897 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

TGGTTCATAT TTCCAATTAC ATTTAGATGA ACTATATCAT CAGGAGTGAA 50 
AGGTTATTGT ATAATCAATC CAGTGGATTC TCGTTCTGGC ACCTTTAGAA 100 
GTACATGTGC GGAAAAGAAT GATAAGGTTT GTATTGTTGT TGACAAGGCC 150 
TGTTGCCTTT CTCATTTGTA AATGTTCTGA ACGACTCCTA AATTACTCTT 200 
AAAGTGTAAG GTCTTCCGTG CCTGTTTGTA TATATAATGC TGTGCCGTGA 250 
CTTACCTTTT GTACCATTTG TTCAAATGTA TGGCCTGGAC ACTAGGGTTG 300 
TCAAAAATGT CTCATGACTT CACCCTTCTT TCTTGTCTTG GTGCCCGTTT 350 
TATTGGTCTG AGAACGGCGT GATGCCAAAT TCTGCCGCTC CACAGTGAGC 400 
ATTTCGATCT ACTGGAAATT GACCAACTTA TTTTATCACT TGATAACTAG 450 
AGTCTGGGTT CAAACAAAAT CCAATAACTT CAATCATACA TTGTATTTAT 500 
ATTGAAAAAA TTATGCACAA CTCAGTAAAT TACCTTTTTT TGCAGTCAAA 550 
AATTCTAGAT CAGTTTCTAA TTAATCAAAA TGGCCTTTAT AGGGTCCCAG 600 
TTCCATTAAT ATACCTGCCG TCCATGCACT GATTACAAGA CAAATACCTC 650 
ACTATGTTTG TTAGTGCTTG GTAATATAAA ACCTTTTCTT TTATGAGAAA 700 
GTTCACCGAA AATAATTTTC TATTTGTGGC ATAACTAGTA TCGAAGTATA 750 
TAGTGCAGAT TGACAAGAAT TTAATTTTGC AGTTGGGCAC ATGAACAATT 800 
TTCCTCAAAG TTGTAGAAAA TATTTTTCAT TTTCTTGTCA CCGAAAATTA 850 
TTTATAATTG AAATTGAAAC CGAATGAGCT GCAAGACTCG AGTCGAATTT 900 
CAAAAAAATT GACCAACTAA ATATGAAAAA ATCCGAATAT ATCCCCCACC 950 
CCCTACCAAA CACATCCTAA AGTGAGGTAT AGACTGGGAC TGGGATTGGG 1000 
AAAAGGGTAA AATGCTTTCA CTAGCTTAGC AAAGATTCCA CTTTGTTAGC 1050 
TATCTTTCTT TCTCATTTCC TTTTTTCTTT TTCTTTTTTT TGTTATATAA 1100 
GCCAAAGTAG GTACCCAAAA GCATCAATAT TTTGTATTGC TTGGTGATTC 1150 
CTCTTTACTC CAGTATTTCA TTTTCTACAA GTTCCACCTC CCTCCATAAT 1200 
TAACCATTAT CAATCTTATA CATTTTCTAT AATGGAAACT CTTCTCAAGC 1250 
CTTTTCCATC TCTTTTACTT TCCTCTCCTA CACCCTATAG GTCTATTGTC 1300 
CAACAAAATC CTTCTTTTCT AAGTCCCACC ACCCAAAAAA AATCAAGAAA 1350 
ATGTCTTCTT AGAAACAAAA GTAGTAAACT TTTTTGTAGC TTTCTTGATT 1400 
TAGCACCCAC ATCAAAGCCA GAGTCTTTAA ATGTTAACAT CTCATGGGTT 1450 
GATCCTAATT CTGGTCGGGC TCAATTCGAC GTGATCATTA TCGGAGCTGG 1500 
CCCTGCTGGG CTCAGGTTAG CTGAACAAGT TTCTAAATAT GGTATTAAGG 1550 
TATGTTGTGT TGACCCTTCA CCACTCTCCA TGTGGCCAAA TAATTATGGT 1600 
GTTTGGGTTG ATGAGTTTGA GAATTTAGGA CTGGAAGATT GTTTAGATCA 1650 
TAAATGGCCT ATGACTTGTG TGCATATAAA TGATAACAAG ACTAAGTATT 1700 
TGGGAAGACC ATATGGTAGA GTTAGTAGAA AGAAGCTGAA GTTGAAATTG 1750 
TTGAACAGTT GTGTTGAGAA CAGAGTGAAG TTTTATAAAG CTAAGGTTTG 1800 
GAAAGTGGAA CATGAAGAAT TTGAGTCTTC AATTGTTTGT GATGATGGTA 1850 
AGAAGATAAG AGGTAGTTTG GTTGTGGATG CAAGTGGTTT TGCTAGTGAT 1900 
TTTATAGAGT ATGACAAGCC AAGAAACCAT GGTTATCAAA TTGCTCATGG 1950 
GGTTTTAGTA GAAGTTGATA ATCATCCATT TGATTTGGAT AAAATGGTGC 2000 
TTATGGATTG GAGGGATTCT CATTTAGGTA ATGAGCCATA TTTAAGGGTG 2050 
AATAATGCTA AAGAACCAAC ATTCTTGTAT GCAATGCCAT TTGATAGAAA 2100 
TTTGGTTTTC TTGGAAGAGA CTTCTTTGGT GAGTCGTCCT GTGTTATCGT 2150 
ATATGGAAGT AAAAAGAAGG ATGGTGGCAA GATTAAGGCA TTTGGGGATC 2200 
AAAGTGAGAA GTGTTATTGA GGAAGAGAAA TGTGTGATCC CTATGGGAGG 2250 
ACCACTTCCG CGGATTCCTC AAAATGTTAT GGCTATTGGT GGGAATTCAG 2300 
GGATAGTTCA TCCATCAACG GGGTACATGG TGGCTAGGAG CATGGCTTTA 2350 
GCACCAGTAC TAGCTGAAGC CATCGTCGAG GGGCTTGGCT CAACAAGAAT 2400 
GATAAGAGGG TCTCAACTTT ACCATAGAGT TTGGAATGGT TTGTGGCCTT 2450 
TGGATAGAAG ATGTGTTAGA GAATGTTATT CATTTGGGAT GGAGACATTG 2500 
TTGAAGCTTG ATTTGAAAGG GACTAGGAGA TTGTTTGACG CTTTCTTTGA 2550 
TCTTGATCCT AAATACTGGC AAGGGTTCCT TTCTTCAAGA TTGTCTGTCA 2600 
AAGAACTTGG TTTACTCAGC TTGTGTCTTT TCGGACATGG CTCAAATTTG 2650 
ACTAGGTTGG ATATTGTTAC AAAATGTCCT GTTCCTTTGG TTAGACTGAT 2700 
TGGCAATCTA GCAGTAGAGA GCCTTTGAAT GTGAAAAGTT TGAATCATTT 2750 
TCTTTATTTT AATTTCTTTG ATTATTTTCA TATTTTCTCA ATGCAAAAGT 2800 
GAGAGAAGAC TATACACTGT CAACAAATAA ACTACTATTG GAAAGTTAAA 2850 
ATAATGTGTG TGTTGTATGT TATGCTAATG GAATGGATTG GTGTAAA 2897 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1740 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

ATGGAAGCTC TTCTCAAGCC TTTTCCATCT CTTTTACTTT CCTCTCCTAC 50 
ACCCTATAGG TCTATTGTCC AACAAAATCC TTCTTTTCTA AGTCCCACCA 100 
CCAAAAAAAA TCAAGAAAAT GTCTTCTTAG AAACAAAAGT AGTAAACTTT 150 
TTTGTAGCTT TCTTGATTTA GCACCCACAT CAAAGCCAGA GTCTTTAAAT 200 
GTTAACATCT CATGGGTTGA TCCTAATTCG AATCGGGCTC AATTCGACGT 250 
GATCATTATC GGAGCTGGCC CTGCTGGGCT CAGGCTAGCT GAACAAGTTT 300 
CTAAATATGG TATTAAGGTA TGTTGTGTTG ACCCTTCACC ACTCTCCATG 350 
TGGCCAAATA ATTATGGTGT TTGGGTTGAT GAGTTTGAGA ATTTAGGACT 400 
GGAAAATTGT TTAGATCATA AATGGCCTAT GACTTGTGTG CATATAAATG 450 
ATAACAAAAC TAAGTATTTG GGAAGACCAT ATGGTAGAGT TAGTAGAAAG 500 
AAGCTGAAGT TGAAATTGTT GAATAGTTGT GTTGAGAACA GAGTGAAGTT 550 
TTATAAAGCT AAGGTTTGGA AAGTGGAACA TGAAGAATTT GAGTCTTCAA 600 
TTGTTTGTGA TGATGGTAAG AAGATAAGAG GTAGTTTGGT TGTGGATGCA 650 
AGTGGTTTTG CTAGTGATTT TATAGAGTAT GACAGGCCAA GAAACCATGG 700 
TTATCAAATT GCTCATGGGG TTTTAGTAGA AGTTGATAAT CATCCATTTG 750 
ATTTGGATAA AATGGTGCTT ATGGATTGGA GGGATTCTCA TTTGGGTAAT 800 
GAGCCATATT TAAGGGTGAA TAATGCTAAA GAACCAACAT TCTTGTATGC 850 
AATGCCATTT GATAGAGATT TGGTTTTCTT GGAAGAGACT TCTTTGGTGA 900 
GTCGTCCTGT GTTATCGTAT ATGGAAGTAA AAAGAAGGAT GGTGGCAAGA 950 
TTAAGGCATT TGGGGATCAA AGTGAAAAGT GTTATTGAGG AAGAGAAATG 1000 
TGTGATCCCT ATGGGAGGAC CACTTCCGCG GATTCCTCAA AATGTTATGG 1050 
CTATTGGTGG GAATTCAGGG ATAGTTCATC CATCAACAGG GTACATGGTG 1100 
GCTAGGAGCA TGGCTTTAGC ACCAGTACTA GCTGAAGCCA TCGTCGAGGG 1150 
GCTTGGCTCA ACAAGAATGA TAAGAGGGTC TCAACTTTAC CATAGAGTTT 1200 
GGAATGGTTT GTGGCCTTTG GATAGAAGAT GTGTTAGAGA ATGTTATTCA 1250 
TTTGGGATGG AGACATTGTT GAAGCTTGAT TTGAAAGGGA CTAGGAGATT 1300 
GTTTGACGCT TTCTTTGATC TTGATCCTAA ATACTGGCAA GGGTTCCTTT 1350 
CTTCAAGATT GTCTGTCAAA GAAACTTGGT TTACTCAGCT TGTGTCTTTT 1400 
CGGACATGGC TCAAACATGA CTAGGTTGGG ATATTGTTAC AAAATGTCCT 1450 
CTTCCTTTGG TTAGACTGAT TGGCAATCTA GCAATAGAGA GCCTTTGAAA 1500 
TGTGAAAAGT TTGAATCATT TTCTTCATTT TAATTTCTTT GATTATTTTC 1550 
ATATTTTCTC AATTGCAGAA TGAGATAAAA ACTACATACT GTCGACAAAT 1600 
AAACTACTAT TGGAANGTTA AAATAATGTG TGTGTTGNAT GTTANGCCTA 1650 
ATGGAANGGA TGNGGTTANG CAATTTATGA ACTGNNCGCT CTGTTCGCTT 1700 
AAAANCCTTG GTTCCACCTT AANGGAANGG NCCGGCCATT 1740 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1666 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 







(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATG GAA GCT CTT CTC AAG CCT TTT CCA TCT CTT TTA CTT TCC TCT 45 
Met Glu Ala Leu Leu Lys Pro Phe Pro Ser Leu Leu Leu Ser Ser 
5 10 15 

CCT ACA CCC CAT AGG TCT ATT TTC CAA CAA AAT CCC TCT TTT CTA 90 
Pro Thr Pro His Arg Ser Ile Phe Gln Gln Asn Pro Ser Phe Leu 
20 25 30 

AGT CCC ACC ACC AAA AAA AAA TCA AGA AAA TGT CTT CTT AGA AAC 135 
Ser Pro Thr Thr Lys Lys Lys Ser Arg Lys Cys Leu Leu Arg Asn 
35 40 45 

AAA AGT AGT AAA CTT TTT TGT AGC TTT CTT GAT TTA GCA CCC ACA 180 
Lys Ser Ser Lys Leu Phe Cys Ser Phe Leu Asp Leu Ala Pro Thr 
50 55 60 

TCA AAG CCA GAG TCT TTA GAT GTT AAC ATC TCA TGG GTT GAT CCT 225 
Ser Lys Pro Glu Ser Leu Asp Val Asn Ile Ser Trp Val Asp Pro 
65 70 75 

AAT TCG AAT CGG GCT CAA TTC GAC GTG ATC ATT ATC GGA GCT GGC 270 
Asn Ser Asn Arg Ala Gln Phe Asp Val Ile Ile Ile Gly Ala Gly 
80 85 90 

CCT GCT GGG CTC AGG CTA GCT GAA CAA GTT TCT AAA TAT GGT ATT 315 
Pro Ala Gly Leu Arg Leu Ala Glu Gln Val Ser Lys Tyr Gly Ile 
95 100 105 

AAG GTA TGT TGT GTT GAC CCT TCA CCA CTC TCC ATG TGG CCA AAT 360 
Lys Val Cys Cys Val Asp Pro Ser Pro Leu Ser Met Trp Pro Asn 
110 115 120 

AAT TAT GGT GTT TGG GTT GAT GAG TTT GAG AAT TTA GGA CTG GAA 405 
Asn Tyr Gly Val Trp Val Asp Glu Phe Glu Asn Leu Gly Leu Glu 
125 130 135 

AAT TGT TTA GAT CAT AAA TGG CCT ATG ACT TGT GTG CAT ATA AAT 450 
Asn Cys Leu Asp His Lys Trp Pro Met Thr Cys Val His Ile Asn 
140 145 150 

GAT AAC AAA ACT AAG TAT TTG GGA AGA CCA TAT GGT AGA GTT AGT 495 
Asp Asn Lys Thr Lys Tyr Leu Gly Arg Pro Tyr Gly Arg Val Ser 
155 160 165 

AGA AAG AAG CTG AAG TTG AAA TTG TTG AAT AGT TGT GTT GAG AAC 540 
Arg Lys Lys Leu Lys Leu Lys Leu Leu Asn Ser Cys Val Glu Asn 
170 175 180 

AGA GTG AAG TTT TAT AAA GCT AAG GTT TGG AAA GTG GAA CAT GAA 585 
Arg Val Lys Phe Tyr Lys Ala Lys Val Trp Lys Val Glu His Glu 
185 190 195 

GAA TTT GAG TCT TCA ATT GTT TGT GAT GAT GGT AAG AAG ATA AGA 630 
Glu Phe Glu Ser Ser Ile Val Cys Asp Asp Gly Lys Lys Ile Arg 
200 205 210 

GGT AGT TTG GTT GTG GAT GCA AGT GGT TTT GCT AGT GAT TTT ATA 675 
Gly Ser Leu Val Val Asp Ala Ser Gly Phe Ala Ser Asp Phe Ile 
215 220 225 

GAG TAT GAC AGG CCA AGA AAC CAT GGT TAT CAA ATT GCT CAT GGG 720 
Glu Tyr Asp Arg Pro Arg Asn His Gly Tyr Gln Ile Ala His Gly 
230 235 240 

GTT TTA GTA GAA GTT GAT AAT CAT CCA TTT GAT TTG GAT AAA ATG 765 
Val Leu Val Glu Val Asp Asn His Pro Phe Asp Leu Asp Lys Met 
245 250 255 

GTG CTT ATG GAT TGG AGG GAT TCT CAT TTG GGT AAT GAG CCA TAT 810 
Val Leu Met Asp Trp Arg Asp Ser His Leu Gly Asn Glu Pro Tyr 
260 265 270 

TTA AGG GTG AAT AAT GCT AAA GAA CCA ACA TTC TTG TAT GCA ATG 855 
Leu Arg Val Asn Asn Ala Lys Glu Pro Thr Phe Leu Tyr Ala Met 
275 280 285 

CCA TTT GAT AGA GAT TTG GTT TTC TTG GAA GAG ACT TCT TTG GTG 900 
Pro Phe Asp Arg Asp Leu Val Phe Leu Glu Glu Thr Ser Leu Val 
290 295 300 

AGT CGT CCT GTT TTA TCG TAT ATG GAA GTA AAA AGA AGG ATG GTG 945 
Ser Arg Pro Val Leu Ser Tyr Met Glu Val Lys Arg Arg Met Val 
305 310 315 

GCA AGA TTA AGG CAT TTG GGG ATC AAA GTG AAA AGT GTT ATT GAG 990 
Ala Arg Leu Arg His Leu Gly Ile Lys Val Lys Ser Val Ile Glu 
320 325 330 

GAA GAG AAA TGT GTG ATC CCT ATG GGA GGA CCA CTT CCG CGG ATT 1035 
Glu Glu Lys Cys Val Ile Pro Met Gly Gly Pro Leu Pro Arg Ile 
335 340 345 

CCT CAA AAT GTT ATG GCT ATT GGT GGG AAT TCA GGG ATA GTT CAT 1080 
Pro Gln Asn Val Met Ala Ile Gly Gly Asn Ser Gly Ile Val His 
350 355 360 

CCA TCA ACA GGG TAC ATG GTG GCT AGG AGC ATG GCT TTA GCA CCA 1125 
Pro Ser Thr Gly Tyr Met Val Ala Arg Ser Met Ala Leu Ala Pro 
365 370 375 

GTA CTA GCT GAA GCC ATC GTC GAG GGG CTT GGC TCA ACA AGA ATG 1170 
Val Leu Ala Glu Ala Ile Val Glu Gly Leu Gly Ser Thr Arg Met 
380 385 390 

ATA AGA GGG TCT CAA CTT TAC CAT AGA GTT TGG AAT GGT TTG TGG 1215 
Ile Arg Gly Ser Gln Leu Tyr His Arg Val Trp Asn Gly Leu Trp 
395 400 405 

CCT TTG GAT AGA AGA TGT GTT AGA GAA TGT TAT TCA TTT GGG ATG 1260 
Pro Leu Asp Arg Arg Cys Val Arg Glu Cys Tyr Ser Phe Gly Met 
410 415 420 

GAG ACA TTG TTG AAG CTT GAT TTG AAA GGG ACT AGG AGA TTG TTT 1305 
Glu Thr Leu Leu Lys Leu Asp Leu Lys Gly Thr Arg Arg Leu Phe 
425 430 435 

GAC GCT TTC TTT GAT CTT GAT CCT AAA TAC TGG CAA GGG TTC CTT 1350 
Asp Ala Phe Phe Asp Leu Asp Pro Lys Tyr Trp Gln Gly Phe Leu 
440 445 450 

TCT TCA AGA TTG TCT GTC AAA GAA CTT GGT TTA CTC AGC TTG TGT 1395 
Ser Ser Arg Leu Ser Val Lys Glu Leu Gly Leu Leu Ser Leu Cys 
455 460 465 

CTT TTC GGA CAT GGC TCA AAC ATG ACT AGG TTG GAT ATT GTT ACA 1440 
Leu Phe Gly His Gly Ser Asn Met Thr Arg Leu Asp Ile Val Thr 
470 475 480 

AAA TGT CCT CTT CCT TTG GTT AGA CTG ATT GGC AAT CTA GCA ATA 1485 
Lys Cys Pro Leu Pro Leu Val Arg Leu Ile Gly Asn Leu Ala Ile 
485 490 495 

GAG AGC CTT TGA ATG TGA AAA GTT TGA ATC ATT TTC TTC ATT TTA 1530 
Glu Ser Leu 
496 

ATT TCT TTG ATT ATT TTC ATA TTT TCT CAA TTG CAA AAG TGA GAT 1575 
AAG AGC TAC ATA CTG TCA ACA AAT AAA CTA CTA TTG GAA AGT TAA 1620 
AAT ATG TGT TTG TTG TAT GTT ATT CTA ATG GAA TGG ATT TTG TAA 1665 
A 1666 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2876 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

G AAT TCT CTG AAA AGG AGC ACC ATA TTT GCC GCA CTG TGG TTC 43 
ATA TTT CCA AGT ACA TTT AGA TGA ACT ATA TCA TCA GAT TGA AAG 88 
GTT ATT GTA TAA TCA ATC CAG TGG ATT CTC GTT CTG GCA CCT TTA 133 
GAA GTA CAT GTG CGG AAA AGA ATG ATA AGG TTT GTA TTG TTG TTG 178 
ACA AAG CCT GTT GCC TTT CTC ATT TGT AAA TGT TCT GAA CGA CTC 223 
CTA AAT TAC TCT TAA GGT GTA AGG TCT TCC GTG CCT GTT TGT AAA 268 
TAT AAT GCT GTG CCG TGA CTT ACC TTT TGT ACC ATT TGT TCA AAT 313 
GTA TGG CCT GAA CAC CAG GGT TGT CAA AAA TGT CTC ATG CCC GTT 358 
TTA TTG GTC TGA AAA TGG CGT GAT GCC AAA TTC TGC CGC TCC ACA 403 
GTG AGC ATT TCG ATC TAC TGG AAA TTG ACC AAC TTA TTT TAT CAC 448 
TTG ATA ACT AAA CAA AAT CCT ATT AAC TTT AAT CAT ACA TTG TAT 493 
TTA TAC CGA AAA ATT TAT GCA TAA CTC ATT AAA TTA CCT TTT TTA 538 
GCA GTC AAA TTC TAA ATC AGT TTC TAA TTT ATC AAA ATG GCT TTT 583 
ATA GGG TCC CAT TTC CAC TAA TAT ACC TGC CGT CCA TGC ACT GAC 628 
TAC AAA ACA AAT ACC TCA CTA TGT TTG TTA GTG CTT GGT AAT ATA 673 
AAA CCT TTT CTT TTA TGA GAA AGT TCA CCG AGA ATA ATT TTC TAT 718 
TTG TGG CAT AAT AGT ATA TAG TGC AGA TTG ACA AGA ATT TAA TTT 763 
TGC AGT TGG GCA CAT GAA CAA TTT TCC TCA AAG TTG TAG AAA GTA 808 
CTT TTC ATT TTC TTG TCA CCG AAA ATT ATT TAT AAT TGA AAT TAA 853 
AAC CGA ATG AGC TGC AAG ATT CAA GTC GAA TTT TCA AAA GAA TTG 898 
ACC AAG AAA AAA TTC AAA AAT ATC CCC CAC CCC CTA CCA AAC ACA 943 
TCC TAA AGT GAG GTA TAG ACT GGG ACT GGG ATT GGG AAA AGG GTA 988 
AAA TGC TTT CAC TAG CTT AGC AAA GAT TCC ACT TTG TTA GCT ATC 1033 
TTT CTT TCT CAT TTC CTT TTT TCT TTT TCT TTT TTT TGT TAT ATA 1078 
AGC CAA AGT AGG TAC CCA AAA GCA TCA ATA TTT TGT ATT GCT TGG 1123 
TGA TTC CTC TGT AGT CCA GTA TTT CAT TTT CTA CAA GTT CCA CCT 1168 
CCC TCC ATA ATT AAC CAT TAT CAA TCT TAT ACA TTC TCT ATA ATG 1213 

Met 

GAA ACT CTT CTC AAG CCT TTT CCA TCT CTT TTA CTT TCC TCT CCT 1258 
Glu Thr Leu Leu Lys Pro Phe Pro Ser Leu Leu Leu Ser Ser Pro 
5 10 15 

ACA CCC CAT AGG TCT ATT TTC CAA CAA AAT CCC TCT TTT CTA AGT 1303 
Thr Pro His Arg Ser Ile Phe Gln Gln Asn Pro Ser Phe Leu Ser 
20 25 30 

CCC ACC ACC AAA AAA AAA TCA AGA AAA TGT CTT CTT AGA AAC AAA 1348 
Pro Thr Thr Lys Lys Lys Ser Arg Lys Cys Leu Leu Arg Asn Lys 
35 40 45 

AGT AGT AAA CTT TTT TGT AGC TTT CTT GAT TTA GCA CCC ACA TCA 1393 
Ser Ser Lys Leu Phe Cys Ser Phe Leu Asp Leu Ala Pro Thr Ser 
50 55 60 

AAG CCA GAG TCT TTA GAT GTT AAC ATC TCA TGG GTT GAT CCT AAT 1438 
Lys Pro Glu Ser Leu Asp Val Asn Ile Ser Trp Val Asp Pro Asn 
65 70 75 

TCG AAT CGG GCT CAA TTC GAC GTG ATC ATT ATC GGA GCT GGC CCT 1483 
Ser Asn Arg Ala Gln Phe Asp Val Ile Ile Ile Gly Ala Gly Pro 
80 85 90 

GCT GGG CTC AGG CTA GCT GAA CAA GTT TCT AAA TAT GGT ATT AAG 1528 
Ala Gly Leu Arg Leu Ala Glu Gln Val Ser Lys Tyr Gly Ile Lys 
95 100 105 

GTA TGT TGT GTT GAC CCT TCA CCA CTC TCC ATG TGG CCA AAT AAT 1573 
Val Cys Cys Val Asp Pro Ser Pro Leu Ser Met Trp Pro Asn Asn 
110 115 120 

TAT GGT GTT TGG GTT GAT GAG TTT GAG AAT TTA GGA CTG GAA AAT 1618 
Tyr Gly Val Trp Val Asp Glu Phe Glu Asn Leu Gly Leu Glu Asn 
125 130 135 

TGT TTA GAT CAT AAA TGG CCT ATG ACT TGT GTG CAT ATA AAT GAT 1663 
Cys Leu Asp His Lys Trp Pro Met Thr Cys Val His Ile Asn Asp 
140 145 150 

AAC AAA ACT AAG TAT TTG GGA AGA CCA TAT GGT AGA GTT AGT AGA 1708 
Asn Lys Thr Lys Tyr Leu Gly Arg Pro Tyr Gly Arg Val Ser Arg 
155 160 165 

AAG AAG CTG AAG TTG AAA TTG TTG AAT AGT TGT GTT GAG AAC AGA 1753 
Lys Lys Leu Lys Leu Lys Leu Leu Asn Ser Cys Val Glu Asn Arg 
170 175 180 

GTG AAG TTT TAT AAA GCT AAG GTT TGG AAA GTG GAA CAT GAA GAA 1798 
Val Lys Phe Tyr Lys Ala Lys Val Trp Lys Val Glu His Glu Glu 
185 190 195 

TTT GAG TCT TCA ATT GTT TGT GAT GAT GGT AAG AAG ATA AGA GGT 1843 
Phe Glu Ser Ser Ile Val Cys Asp Asp Gly Lys Lys Ile Arg Gly 
200 205 210 

AGT TTG GTT GTG GAT GCA AGT GGT TTT GCT AGT GAT TTT ATA GAG 1888 
Ser Leu Val Val Asp Ala Ser Gly Phe Ala Ser Asp Phe Ile Glu 
215 220 225 

TAT GAC AGG CCA AGA AAC CAT GGT TAT CAA ATT GCT CAT GGG GTT 1933 
Tyr Asp Arg Pro Arg Asn His Gly Tyr Gln Ile Ala His Gly Val 
230 235 240 

TTA GTA GAA GTT GAT AAT CAT CCA TTT GAT TTG GAT AAA ATG GTG 1978 
Leu Val Glu Val Asp Asn His Pro Phe Asp Leu Asp Lys Met Val 
245 250 255 

CTT ATG GAT TGG AGG GAT TCT CAT TTG GGT AAT GAG CCA TAT TTA 2023 
Leu Met Asp Trp Arg Asp Ser His Leu Gly Asn Glu Pro Tyr Leu 
260 265 270 

AGG GTG AAT AAT GCT AAA GAA CCA ACA TTC TTG TAT GCA ATG CCA 2068 
Arg Val Asn Asn Ala Lys Glu Pro Thr Phe Leu Tyr Ala Met Pro 
275 280 285 

TTT GAT AGA GAT TTG GTT TTC TTG GAA GAG ACT TCT TTG GTG AGT 2113 
Phe Asp Arg Asp Leu Val Phe Leu Glu Glu Thr Ser Leu Val Ser 
290 295 300 

CGT CCT GTT TTA TCG TAT ATG GAA GTA AAA AGA AGG ATG GTG GCA 2158 
Arg Pro Val Leu Ser Tyr Met Glu Val Lys Arg Arg Met Val Ala 
305 310 315 

AGA TTA AGG CAT TTG GGG ATC AAA GTG AAA AGT GTT ATT GAG GAA 2203 
Arg Leu Arg His Leu Gly Ile Lys Val Lys Ser Val Ile Glu Glu 
320 325 330 

GAG AAA TGT GTG ATC CCT ATG GGA GGA CCA CTT CCG CGG ATT CCT 2248 
Glu Lys Cys Val Ile Pro Met Gly Gly Pro Leu Pro Arg Ile Pro 
335 340 345 

CAA AAT GTT ATG GCT ATT GGT GGG AAT TCA GGG ATA GTT CAT CCA 2293 
Gln Asn Val Met Ala Ile Gly Gly Asn Ser Gly Ile Val His Pro 
350 355 360 

TCA ACA GGG TAC ATG GTG GCT AGG AGC ATG GCT TTA GCA CCA GTA 2338 
Ser Thr Gly Tyr Met Val Ala Arg Ser Met Ala Leu Ala Pro Val 
365 370 375 

CTA GCT GAA GCC ATC GTC GAG GGG CTT GGC TCA ACA AGA ATG ATA 2383 
Leu Ala Glu Ala Ile Val Glu Gly Leu Gly Ser Thr Arg Met Ile 
380 385 390 

AGA GGG TCT CAA CTT TAC CAT AGA GTT TGG AAT GGT TTG TGG CCT 2428 
Arg Gly Ser Gln Leu Tyr His Arg Val Trp Asn Gly Leu Trp Pro 
395 400 405 

TTG GAT AGA AGA TGT GTT AGA GAA TGT TAT TCA TTT GGG ATG GAG 2473 
Leu Asp Arg Arg Cys Val Arg Glu Cys Tyr Ser Phe Gly Met Glu 
410 415 420 

ACA TTG TTG AAG CTT GAT TTG AAA GGG ACT AGG AGA TTG TTT GAC 2518 
Thr Leu Leu Lys Leu Asp Leu Lys Gly Thr Arg Arg Leu Phe Asp 
425 430 435 

GCT TTC TTT GAT CTT GAT CCT AAA TAC TGG CAA GGG TTC CTT TCT 2563 
Ala Phe Phe Asp Leu Asp Pro Lys Tyr Trp Gln Gly Phe Leu Ser 
440 445 450 

TCA AGA TTG TCT GTC AAA GAA CTT GGT TTA CTC AGC TTG TGT CTT 2608 
Ser Arg Leu Ser Val Lys Glu Leu Gly Leu Leu Ser Leu Cys Leu 
455 460 465 

TTC GGA CAT GGC TCA AAC ATG ACT AGG TTG GAT ATT GTT ACA AAA 2653 
Phe Gly His Gly Ser Asn Met Thr Arg Leu Asp Ile Val Thr Lys 
470 475 480 

TGT CCT CTT CCT TTG GTT AGA CTG ATT GGC AAT CTA GCA ATA GAG 2698 
Cys Pro Leu Pro Leu Val Arg Leu Ile Gly Asn Leu Ala Ile Glu 
485 490 495 

AGC CTT TGA ATG TGA AAA GTT TGA ATC ATT TTC TTC ATT TTA ATT 2743 
Ser Leu 
498 

TCT TTG ATT ATT TTC ATA TTT TCT CAA TTG CAA AAG TGA GAT AAG 2788 
AGC TAC ATA CTG TCA ACA AAT AAA CTA CTA TTG GAA AGT TAA AAT 2833 
ATG TGT TTG TTG TAT GTT ATT CTA ATG GAA TGG ATT TTG TAA A 2876 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 



WO 00/08920 






(A) LENGTH: 3265 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
ATC TCA TTG TAT AGC TTG TCT TTT GTT TCA GTC GTC TTA GGC TTG 45 
GGT TAG TTG GTG TTG CTG TTT CAT ACT TCT ATC AAC CTT GTG TGA 90 
GTT CCT TTA TAA AAT ATG ACT GTT GGA GGA AGT AAT TTA CCT TTA 135 
GTT CGA CTA CAT CAA GAT TTG CAT CAT TCT CGT CCA AGA AAT CTT 180 
AGT TTG AAG CCT TTT GGT CTG GTA TAT TTG TCA ATC TGA GCT TCG 225 
CAA CTT TCT CAT GAC AGG GGT TTG TTG ACA TGC CTG ATT GTG CTC 270 
TTC CTT TAC TTG ATA ATT GCT GCT TGT TGC GGA GGC ATC ACT CTA 315 
CCT TCC TGC AGA TCA TGA ATT CTC TGA AAA GGA GCA CCA TAT TTG 360 
CCG CAC TGT GGT TCA TAT TTC CAA TTA CAT TTA GAT GAA CTA TAT 405 
CAT CAG GAG TGA AAG GTT ATT GTA TAA TCA ATC CAG TGG ATT CTC 450 
GTT CTG GCA CCT TTA GAA GTA CAT GTG CGG AAA AGA ATG ATA AGG 495 
TTT GTA TTG TTG TTG ACA AGG CCT GTT GCC TTT CTC ATT TGT AAA 540 
TGT TCT GAA CGA CTC CTA AAT TAC TCT TAA AGT GTA AGG TCT TCC 585 
GTG CCT GTT TGT ATA TAT AAT GCT GTG CCG TGA CTT ACC TTT TGT 630 
ACC ATT TGT TCA AAT GTA TGG CCT GGA CAC TAG GGT TGT CAA AAA 675 
TGT CTC ATG ACT TCA CCC TTC TTT CTT GTC TTG GTG CCC GTT TTA 720 
TTG GTC TGA GAA CGG CGT GAT GCC AAA TTC TGC CGC TCC ACA GTG 765 
AGC ATT TCG ATC TAC TGG AAA TTG ACC AAC TTA TTT TAT CAC TTG 810 
ATA ACT AGA GTC TGG GTT CAA ACA AAA TCC AAT AAC TTC AAT CAT 855 
ACA TTG TAT TTA TAT TGA AAA AAT TAT GCA CAA CTC AGT AAA TTA 900 
CCT TTT TTT GCA GTC AAA AAT TCT AGA TCA GTT TCT AAT TAA TCA 945 
AAA TGG CCT TTA TAG GGT CCC AGT TCC ATT AAT ATA CCT GCC GTC 990 
CAT GCA CTG ATT ACA AGA CAA ATA CCT CAC TAT GTT TGT TAG TGC 1035 
TTG GTA ATA TAA AAC CTT TTC TTT TAT GAG AAA GTT CAC CGA AAA 1080 
TAA TTT TCT ATT TGT GGC ATA ACT AGT ATC GAA GTA TAT AGT GCA 1125 
GAT TGA CAA GAA TTT AAT TTT GCA GTT GGG CAC ATG AAC AAT TTT 1170 
CCT CAA AGT TGT AGA AAA TAT TTT TCA TTT TCT TGT CAC CGA AAA 1215 
TTA TTT ATA ATT GAA ATT GAA ACC GAA TGA GCT GCA AGA CTC GAG 1260 
TCG AAT TTC AAA AAA ATT GAC CAA CTA AAT ATG AAA AAA TCC GAA 1305 
TAT ATC CCC CAC CCC CTA CCA AAC ACA TCC TAA AGT GAG GTA TAG 1350 
ACT GGG ACT GGG ATT GGG AAA AGG GTA AAA TGC TTT CAC TAG CTT 1395 
AGC AAA GAT TCC ACT TTG TTA GCT ATC TTT CTT TCT CAT TTC CTT 1440 
TTT TCT TTT TCT TTT TTT TGT TAT ATA AGC CAA AGT AGG TAC CCA 1485 
AAA GCA TCA ATA TTT TGT ATT GCT TGG TGA TTC CTC TTT ACT CCA 1530 
GTA TTT CAT TTT CTA CAA GTT CCA CCT CCC TCC ATA ATT AAC CAT 1575 
TAT CAA TCT TAT ACA TTT TCT ATA ATG GAA ACT CTT CTC AAG CCT 1620 

Met Glu Thr Leu Leu Lys Pro 
5 

TTT CCA TCT CTT TTA CTT TCC TCT CCT ACA CCC TAT AGG TCT ATT 1665 
Phe Pro Ser Leu Leu Leu Ser Ser Pro Thr Pro Tyr Arg Ser Ile 

10 15 20 

GTC CAA CAA AAT CCT TCT TTT CTA AGT CCC ACC ACC CAA AAA AAA 1710 
Val Gln Gln Asn Pro Ser Phe Leu Ser Pro Thr Thr Gln Lys Lys 

25 30 35 

TCA AGA AAA TGT CTT CTT AGA AAC AAA AGT AGT AAA CTT TTT TGT 1755 
Ser Arg Lys Cys Leu Leu Arg Asn Lys Ser Ser Lys Leu Phe Cys 

40 45 50 

AGC TTT CTT GAT TTA GCA CCC ACA TCA AAG CCA GAG TCT TTA AAT 1800 
Ser Phe Leu Asp Leu Ala Pro Thr Ser Lys Pro Glu Ser Leu Asn 

55 60 65 

GTT AAC ATC TCA TGG GTT GAT CCT AAT TCT GGT CGG GCT CAA TTC 1845 
Val Asn Ile Ser Trp Val Asp Pro Asn Ser Gly Arg Ala Gln Phe 

70 75 80 

GAC GTG ATC ATT ATC GGA GCT GGC CCT GCT GGG CTC AGG TTA GCT 1890 
Asp Val Ile Ile Ile Gly Ala Gly Pro Ala Gly Leu Arg Leu Ala 

85 90 95 

GAA CAA GTT TCT AAA TAT GGT ATT AAG GTA TGT TGT GTT GAC CCT 1935 
Glu Gln Val Ser Lys Tyr Gly Ile Lys Val Cys Cys Val Asp Pro 

100 105 110 

TCA CCA CTC TCC ATG TGG CCA AAT AAT TAT GGT GTT TGG GTT GAT 1980 
Ser Pro Leu Ser Met Trp Pro Asn Asn Tyr Gly Val Trp Val Asp 

115 120 125 

GAG TTT GAG AAT TTA GGA CTG GAA GAT TGT TTA GAT CAT AAA TGG 2025 
Glu Phe Glu Asn Leu Gly Leu Glu Asp Cys Leu Asp His Lys Trp 

130 135 140 

CCT ATG ACT TGT GTG CAT ATA AAT GAT AAC AAG ACT AAG TAT TTG 2070 
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Pro. 


Met 


Thr 


Cys 


Val 


His 


He 


Asn 


Asp 


Asn 


Lys 


Thr 


Lys 


Tyr 


Leu 








14 5 










150 










155 








NNN AGA NNN TAT GGT AGA GTT AGT AGA AAG AAG CTG AAG TTG AAA 2115 
Gly Arg Pro Tyr Gly Arg Val Ser Arg Lys Lys Leu Lys Leu Lys 

160 165 170 

NNN NNN NNN NNN TGT GTT GAG AAC AGA GTG AAG TTT TAT AAA GCT 2160 
Leu Leu Asn Ser Cys Val Glu Asn Arg Val Lys Phe Tyr Lys Ala 

175 180 185 

AAG GTT TGG AAA GTG GAA CAT GAA GAA TTT GAG TCT TCA ATT GTT 2205 
Lys Val Trp Lys Val Glu His Glu Glu Phe Glu Ser Ser Ile Val 

190 195 200 

TGT GAT NNN NNN NNN AAG ATA AGA GGT AGT TTG GTT GTG GAT GCA 2250 
Cys Asp Asp Gly Lys Lys Ile Arg Gly Ser Leu Val Val Asp Ala 

205 210 215 

AGT GGT TTT GCT AGT GAT TTT ATA GAG TAT GAC AAG CCA AGA AAC 2295 
Ser Gly Phe Ala Ser Asp Phe Ile Glu Tyr Asp Lys Pro Arg Asn 

220 225 230 

CAT GGT TAT CAA ATT GCT CAT GGG GTT TTA GTA GAA GTT GAT AAT 2340 
His Gly Tyr Gln Ile Ala His Gly Val Leu Val Glu Val Asp Asn 

235 240 245 

CAT CCA TTT GAT TTG GAT AAA ATG GTG CTT ATG GAT TGG AGG GAT 2385 
His Pro Phe Asp Leu Asp Lys Met Val Leu Met Asp Trp Arg Asp 

250 255 260 

TCT CAT TTA GGT AAT GAG CCA TAT TTA AGG GTG AAT AAT GCT AAA 2430 
Ser His Leu Gly Asn Glu Pro Tyr Leu Arg Val Asn Asn Ala Lys 

265 270 275 

GAA CCA ACA TTC TTG TAT GCA ATG CCA TTT GAT AGA AAT TTG GTT 2475 
Glu Pro Thr Phe Leu Tyr Ala Met Pro Phe Asp Arg Asn Leu Val 

280 285 290 

TTC TTG GAA GAG ACT TCT TTG GTG AGT CGT CCT GTG TTA TCG TAT 2520 
Phe Leu Glu Glu Thr Ser Leu Val Ser Arg Pro Val Leu Ser Tyr 

295 300 305 

ATG GAA GTA AAA AGA AGG ATG GTG GCA AGA TTA AGG CAT TTG GGG 2565 
Met Glu Val Lys Arg Arg Met Val Ala Arg Leu Arg His Leu Gly 

310 315 320 

ATC AAA GTG AGA AGT GTT ATT GAG GAA GAG AAA TGT GTG ATC CCT 2610 
Ile Lys Val Arg Ser Val Ile Glu Glu Glu Lys Cys Val Ile Pro 

325 330 335 

ATG GGA GGA CCA CTT CCG CGG ATT CCT CAA AAT GTT ATG GCT ATT 2655 
Met Gly Gly Pro Leu Pro Arg Ile Pro Gln Asn Val Met Ala Ile 

340 345 350 

GGT GGG AAT TCA GGG ATA GTT CAT CCA TCA ACG GGG TAC ATG GTG 2700 
Gly Gly Asn Ser Gly Ile Val His Pro Ser Thr Gly Tyr Met Val 

355 360 365 

GCT AGG AGC ATG GCT TTA GCA CCA GTA CTA GCT GAA GCC ATC GTC 2745 
Ala Arg Ser Met Ala Leu Ala Pro Val Leu Ala Glu Ala Ile Val 

370 375 380 

GAG GGG CTT GGC TCA ACA AGA ATG ATA AGA GGG TCT CAA CTT TAC 2790 
Glu Gly Leu Gly Ser Thr Arg Met Ile Arg Gly Ser Gln Leu Tyr 

385 390 395 

CAT AGA GTT TGG AAT GGT TTG TGG CCT TTG GAT AGA AGA TGT GTT 2835 
His Arg Val Trp Asn Gly Leu Trp Pro Leu Asp Arg Arg Cys Val 

400 405 410 

AGA GAA TGT TAT TCA TTT GGG ATG GAG ACA TTG TTG AAG CTT GAT 2880 
Arg Glu Cys Tyr Ser Phe Gly Met Glu Thr Leu Leu Lys Leu Asp 

415 420 425 

TTG AAA GGG ACT AGG AGA TTG TTT GAC GCT TTC TTT GAT CTT GAT 2925 
Leu Lys Gly Thr Arg Arg Leu Phe Asp Ala Phe Phe Asp Leu Asp 

430 435 440 

CCT AAA TAC TGG CAA GGG TTC CTT TCT TCA AGA TTG TCT GTC AAA 2970 
Pro Lys Tyr Trp Gln Gly Phe Leu Ser Ser Arg Leu Ser Val Lys 

445 450 455 

GAA CTT GGT TTA CTC AGC TTG TGT CTT TTC GGA CAT GGC TCA AAT 3015 
Glu Leu Gly Leu Leu Ser Leu Cys Leu Phe Gly His Gly Ser Asn 

460 465 470 

TTG ACT AGG TTG GAT ATT GTT ACA AAA TGT CCT GTT CCT TTG GTT 3060 
Leu Thr Arg Leu Asp Ile Val Thr Lys Cys Pro Val Pro Leu Val 

475 480 485 

AGA CTG ATT GGC AAT CTA GCA GTA GAG AGC CTT TGA ATG TGA AAA 3105 
Arg Leu Ile Gly Asn Leu Ala Val Glu Ser Leu 

490 495 498 

GTT TGA ATC ATT TTC TTT ATT TTA ATT TCT TTG ATT ATT TTC ATA 3150 
TTT TCT CAA TGC AAA AGT GAG AGA AGA CTA TAC ACT GTC AAC AAA 3195 
TAA ACT ACT ATT GGA AAG TTA AAA TAA TGT GTG TGT TGT ATG TTA 3240 
TGC TAA TGG AAT GGA TTG GTG TAA A 3265 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1740 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

ATG GAA GCT CTT CTC AAG CCT TTT CCA TCT CTT TTA CTT TCC TCT 45 
Met Glu Ala Leu Leu Lys Pro Phe Pro Ser Leu Leu Leu Ser Ser 
5 10 15 

CCT ACA CCC TAT AGG TCT ATT GTC CAA CAA AAT CCT TCT TTT CTA 90 
Pro Thr Pro Tyr Arg Ser Ile Val Gln Gln Asn Pro Ser Phe Leu 

20 25 30 

AGT CCC ACC ACC AAA AAA AAT CAA GAA AAT GTC TTC TTA GAA ACA 135 
Ser Pro Thr Thr Lys Lys Asn Gln Glu Asn Val Phe Leu Glu Thr 

35 40 45 

AAA GTA GTA AAC TTT TTT GTA GCT TTC TTG ATT TAG CAC CCA CAT 180 
Lys Val Val Asn Phe Phe Val Ala Phe Leu Ile 

50 55 56 

CAA AGC CAG AGT CTT TAA ATG TTA ACA TCT CAT GGG TTG ATC CTA 225 
ATT CGA ATC GGG CTC AAT TCG ACG TGA TCA TTA TCG GAG CTG GCC 270 
CTG CTG GGC TCA GGC TAG CTG AAC AAG TTT CTA AAT ATG GTA TTA 315 
AGG TAT GTT GTG TTG ACC CTT CAC CAC TCT CCA TGT GGC CAA ATA 360 
ATT ATG GTG TTT GGG TTG ATG AGT TTG AGA ATT TAG GAC TGG AAA 405 
ATT GTT TAG ATC ATA AAT GGC CTA TGA CTT GTG TGC ATA TAA ATG 450 
ATA ACA AAA CTA AGT ATT TGG GAA GAC CAT ATG GTA GAG TTA GTA 495 
GAA AGA AGC TGA AGT TGA AAT TGT TGA ATA GTT GTG TTG AGA ACA 540 
GAG TGA AGT TTT ATA AAG CTA AGG TTT GGA AAG TGG AAC ATG AAG 585 
AAT TTG AGT CTT CAA TTG TTT GTG ATG ATG GTA AGA AGA TAA GAG 630 
GTA GTT TGG TTG TGG ATG CAA GTG GTT TTG CTA GTG ATT TTA TAG 675 
AGT ATG ACA GGC CAA GAA ACC ATG GTT ATC AAA TTG CTC ATG GGG 720 
TTT TAG TAG AAG TTG ATA ATC ATC CAT TTG ATT TGG ATA AAA TGG 765 
TGC TTA TGG ATT GGA GGG ATT CTC ATT TGG GTA ATG AGC CAT ATT 810 
TAA GGG TGA ATA ATG CTA AAG AAC CAA CAT TCT TGT ATG CAA TGC 855 
CAT TTG ATA GAG ATT TGG TTT TCT TGG AAG AGA CTT CTT TGG TGA 900 
GTC GTC CTG TGT TAT CGT ATA TGG AAG TAA AAA GAA GGA TGG TGG 945 
CAA GAT TAA GGC ATT TGG GGA TCA AAG TGA AAA GTG TTA TTG AGG 990 
AAG AGA AAT GTG TGA TCC CTA TGG GAG GAC CAC TTC CGC GGA TTC 1035 
CTC AAA ATG TTA TGG CTA TTG GTG GGA ATT CAG GGA TAG TTC ATC 1080 
CAT CAA CAG GGT ACA TGG TGG CTA GGA GCA TGG CTT TAG CAC CAG 1125 
TAC TAG CTG AAG CCA TCG TCG AGG GGC TTG GCT CAA CAA GAA TGA 1170 
TAA GAG GGT CTC AAC TTT ACC ATA GAG TTT GGA ATG GTT TGT GGC 1215 
CTT TGG ATA GAA GAT GTG TTA GAG AAT GTT ATT CAT TTG GGA TGG 1260 
AGA CAT TGT TGA AGC TTG ATT TGA AAG GGA CTA GGA GAT TGT TTG 1305 
ACG CTT TCT TTG ATC TTG ATC CTA AAT ACT GGC AAG GGT TCC TTT 1350 
CTT CAA GAT TGT CTG TCA AAG AAA CTT GGT TTA CTC AGC TTG TGT 1395 
CTT TTC GGA CAT GGC TCA AAC ATG ACT AGG TTG GGA TAT TGT TAC 1440 
AAA ATG TCC TCT TCC TTT GGT TAG ACT GAT TGG CAA TCT AGC AAT 1485 
AGA GAG CCT TTG AAA TGT GAA AAG TTT GAA TCA TTT TCT TCA TTT 1530 
TAA TTT CTT TGA TTA TTT TCA TAT TTT CTC AAT TGC AGA ATG AGA 1575 
TAA AAA CTA CAT ACT GTC GAC AAA TAA ACT ACT ATT GGA ANG TTA 1620 
AAA TAA TGT GTG TGT TGN ATG TTA NGC CTA ATG GAA NGG ATG NGG 1665 
TTA NGC AAT TTA TGA ACT GNN CGC TCT GTT CGC TTA AAA NCC TTG 1710 
GTT CCA CCT TAA NGG AAN GGN CCG GCC ATT 1740 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 498 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 







(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Glu Ala Leu Leu Lys Pro Phe Pro Ser Leu Leu Leu Ser Ser 
5 10 15 

Pro Thr Pro His Arg Ser Ile Phe Gln Gln Asn Pro Ser Phe Leu 
20 25 30 

Ser Pro Thr Thr Lys Lys Lys Ser Arg Lys Cys Leu Leu Arg Asn 
35 40 45 

Lys Ser Ser Lys Leu Phe Cys Ser Phe Leu Asp Leu Ala Pro Thr 
50 55 60 

Ser Lys Pro Glu Ser Leu Asp Val Asn Ile Ser Trp Val Asp Pro 
65 70 75 

Asn Ser Asn Arg Ala Gln Phe Asp Val Ile Ile Ile Gly Ala Gly 
80 85 90 

Pro Ala Gly Leu Arg Leu Ala Glu Gln Val Ser Lys Tyr Gly Ile 
95 100 105 

Lys Val Cys Cys Val Asp Pro Ser Pro Leu Ser Met Trp Pro Asn 
110 115 120 

Asn Tyr Gly Val Trp Val Asp Glu Phe Glu Asn Leu Gly Leu Glu 
125 130 135 

Asn Cys Leu Asp His Lys Trp Pro Met Thr Cys Val His Ile Asn 
140 145 150 

Asp Asn Lys Thr Lys Tyr Leu Gly Arg Pro Tyr Gly Arg Val Ser 
155 160 165 

Arg Lys Lys Leu Lys Leu Lys Leu Leu Asn Ser Cys Val Glu Asn 
170 175 180 

Arg Val Lys Phe Tyr Lys Ala Lys Val Trp Lys Val Glu His Glu 
185 190 195 

Glu Phe Glu Ser Ser Ile Val Cys Asp Asp Gly Lys Lys Ile Arg 
200 205 210 

Gly Ser Leu Val Val Asp Ala Ser Gly Phe Ala Ser Asp Phe Ile 
215 220 225 

Glu Tyr Asp Arg Pro Arg Asn His Gly Tyr Gln Ile Ala His Gly 
230 235 240 

Val Leu Val Glu Val Asp Asn His Pro Phe Asp Leu Asp Lys Met 
245 250 255 

Val Leu Met Asp Trp Arg Asp Ser His Leu Gly Asn Glu Pro Tyr 
260 265 270 

Leu Arg Val Asn Asn Ala Lys Glu Pro Thr Phe Leu Tyr Ala Met 
275 280 285 

Pro Phe Asp Arg Asp Leu Val Phe Leu Glu Glu Thr Ser Leu Val 
290 295 300 

Ser Arg Pro Val Leu Ser Tyr Met Glu Val Lys Arg Arg Met Val 
305 310 315 

Ala Arg Leu Arg His Leu Gly Ile Lys Val Lys Ser Val Ile Glu 
320 325 330 

Glu Glu Lys Cys Val Ile Pro Met Gly Gly Pro Leu Pro Arg Ile 
335 340 345 

Pro Gln Asn Val Met Ala Ile Gly Gly Asn Ser Gly Ile Val His 
350 355 360 

Pro Ser Thr Gly Tyr Met Val Ala Arg Ser Met Ala Leu Ala Pro 
365 370 375 

Val Leu Ala Glu Ala Ile Val Glu Gly Leu Gly Ser Thr Arg Met 
380 385 390 

Ile Arg Gly Ser Gln Leu Tyr His Arg Val Trp Asn Gly Leu Trp 
395 400 405 

Pro Leu Asp Arg Arg Cys Val Arg Glu Cys Tyr Ser Phe Gly Met 
410 415 420 

Glu Thr Leu Leu Lys Leu Asp Leu Lys Gly Thr Arg Arg Leu Phe 
425 430 435 

Asp Ala Phe Phe Asp Leu Asp Pro Lys Tyr Trp Gln Gly Phe Leu 
440 445 450 

Ser Ser Arg Leu Ser Val Lys Glu Leu Gly Leu Leu Ser Leu Cys 
455 460 465 

Leu Phe Gly His Gly Ser Asn Met Thr Arg Leu Asp Ile Val Thr 
470 475 480 

Lys Cys Pro Leu Pro Leu Val Arg Leu Ile Gly Asn Leu Ala Ile 
485 490 495 

Glu Ser Leu 
498 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 498 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Glu Ala Leu Leu Lys Pro Phe Pro Ser Leu Leu Leu Ser Ser 
5 10 15 

Pro Thr Pro His Arg Ser Ile Phe Gln Gln Asn Pro Ser Phe Leu 
20 25 30 

Ser Pro Xaa Thr Xaa Lys Lys Ser Xaa Lys Cys Leu Leu Arg Asn 
35 40 45 

Lys Ser Ser Lys Leu Phe Cys Ser Phe Leu Asp Leu Ala Pro Thr 
50 55 60 

Ser Lys Pro Glu Ser Leu Asp Val Asn Ile Ser Trp Val Asp Pro 
65 70 75 

Asn Ser Asn Arg Ala Gln Phe Asp Val Ile Ile Ile Gly Ala Gly 
80 85 90 

Pro Ala Gly Leu Xaa Leu Ala Glu Gln Val Ser Lys Tyr Gly Ile 
95 100 105 

Lys Val Cys Cys Val Xaa Pro Ser Pro Leu Ser Met Trp Pro Asn 
110 115 120 

Asn Tyr Gly Val Trp Val Asp Glu Phe Glu Asn Leu Gly Leu Glu 
125 130 135 

Asn Cys Leu Asp His Lys Trp Pro Met Thr Cys Val His Ile Asn 
140 145 150 

Asp Asn Lys Thr Xaa Tyr Leu Gly Arg Pro Tyr Gly Arg Val Ser 
155 160 165 

Arg Lys Lys Xaa Xaa Xaa Lys Leu Leu Asn Ser Cys Val Glu Asn 
170 175 180 

Arg Val Lys Phe Tyr Lys Ala Lys Val Trp Lys Val Glu His Glu 
185 190 195 

Glu Phe Glu Ser Xaa Ile Val Cys Asp Asp Gly Lys Lys Ile Arg 
200 205 210 

Gly Ser Leu Val Val Asp Ala Ser Gly Phe Ala Ser Asp Phe Ile 
215 220 225 

Glu Tyr Asp Arg Pro Arg Asn His Gly Tyr Gln Ile Ala His Gly 
230 235 240 

Val Leu Val Glu Val Asp Asn His Pro Phe Asp Leu Asp Lys Met 
245 250 255 

Val Leu Met Asp Trp Arg Asp Ser His Leu Gly Asn Glu Pro Tyr 
260 265 270 

Leu Arg Val Asn Asn Ala Lys Glu Pro Thr Phe Leu Tyr Ala Met 
275 280 285 

Pro Phe Asp Arg Asp Leu Val Phe Leu Glu Glu Thr Ser Leu Val 
290 295 300 

Ser Arg Pro Val Leu Ser Tyr Met Glu Val Lys Arg Arg Met Val 
305 310 315 

Ala Arg Leu Arg His Leu Gly Ile Lys Val Lys Ser Val Ile Glu 
320 325 330 

Glu Glu Lys Cys Val Ile Pro Met Gly Gly Pro Leu Pro Arg Ile 
335 340 345 

Pro Gln Asn Val Met Ala Ile Gly Gly Asn Ser Gly Ile Val His 
350 355 360 

Pro Ser Thr Gly Tyr Met Val Ala Arg Ser Met Ala Leu Ala Pro 
365 370 375 

Val Leu Ala Glu Ala Ile Val Glu Gly Leu Gly Ser Thr Arg Met 
380 385 390 

Ile Arg Gly Ser Gln Leu Tyr His Arg Val Trp Asn Gly Leu Trp 
395 400 405 

Pro Leu Asp Arg Arg Cys Val Arg Glu Cys Tyr Ser Phe Gly Met 
410 415 420 

Glu Thr Leu Leu Lys Leu Asp Leu Lys Gly Thr Arg Arg Leu Phe 
425 430 435 

Asp Ala Phe Phe Asp Leu Asp Pro Lys Tyr Trp Gln Gly Phe Leu 
440 445 450 

Ser Ser Arg Leu Ser Val Lys Glu Leu Gly Leu Leu Ser Leu Cys 
455 460 465 

Leu Phe Gly His Gly Ser Asn Met Thr Arg Leu Asp Ile Val Thr 
470 475 480 

Lys Cys Pro Leu Pro Leu Val Arg Leu Ile Gly Asn Leu Ala Ile 
485 490 495 

Glu Ser Leu 
498 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 498 

(B) TYPE: amino acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



WO 00/08920 PCT/US99/1832? 







(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Glu Thr Leu Leu Lys Pro Phe Pro Ser Leu Leu Leu Ser Ser 
5 10 15 

Pro Thr Pro Tyr Arg Ser Ile Val Gln Gln Asn Pro Ser Phe Leu 
20 25 30 

Ser Pro Thr Thr Gln Lys Lys Ser Arg Lys Cys Leu Leu Arg Asn 
35 40 45 

Lys Ser Ser Lys Leu Phe Cys Ser Phe Leu Asp Leu Ala Pro Thr 
50 55 60 

Ser Lys Pro Glu Ser Leu Asn Val Asn Ile Ser Trp Val Asp Pro 
65 70 75 

Asn Ser Gly Arg Ala Gln Phe Asp Val Ile Ile Ile Gly Ala Gly 
80 85 90 

Pro Ala Gly Leu Arg Leu Ala Glu Gln Val Ser Lys Tyr Gly Ile 
95 100 105 

Lys Val Cys Cys Val Asp Pro Ser Pro Leu Ser Met Trp Pro Asn 
110 115 120 

Asn Tyr Gly Val Trp Val Asp Glu Phe Glu Asn Leu Gly Leu Glu 
125 130 135 

Asp Cys Leu Asp His Lys Trp Pro Met Thr Cys Val His Ile Asn 
140 145 150 

Asp Asn Lys Thr Lys Tyr Leu Gly Arg Pro Tyr Gly Arg Val Ser 
155 160 165 

Arg Lys Lys Leu Lys Leu Lys Leu Leu Asn Ser Cys Val Glu Asn 
170 175 180 

Arg Val Lys Phe Tyr Lys Ala Lys Val Trp Lys Val Glu His Glu 
185 190 195 

Glu Phe Glu Ser Ser Ile Val Cys Asp Asp Gly Lys Lys Ile Arg 
200 205 210 

Gly Ser Leu Val Val Asp Ala Ser Gly Phe Ala Ser Asp Phe Ile 
215 220 225 

Glu Tyr Asp Lys Pro Arg Asn His Gly Tyr Gln Ile Ala His Gly 
230 235 240 

Val Leu Val Glu Val Xaa Asn His Pro Phe Asp Leu Asp Lys Met 
245 250 255 

Val Leu Met Asp Trp Arg Asn Ser His Leu Gly Asn Glu Pro Tyr 
260 265 270 

Leu Arg Val Asn Asn Ala Lys Glu Pro Thr Phe Leu Tyr Ala Met 
275 280 285 

Pro Phe Asp Arg Asn Leu Val Phe Leu Glu Glu Thr Ser Leu Val 
290 295 300 

Ser Arg Pro Val Leu Ser Tyr Met Glu Val Lys Arg Arg Met Val 
305 310 315 

Ala Arg Leu Arg His Leu Gly Ile Lys Val Arg Ser Val Ile Glu 
320 325 330 

Glu Glu Lys Cys Val Ile Pro Met Gly Gly Pro Leu Pro Arg Ile 
335 340 345 

Pro Gln Asn Val Met Ala Ile Gly Gly Asn Ser Gly Ile Val His 
350 355 360 

Pro Ser Thr Gly Tyr Met Val Ala Arg Ser Met Ala Leu Ala Pro 
365 370 375 

Val Leu Ala Glu Ala Ile Val Glu Gly Leu Gly Ser Thr Arg Met 
380 385 390 

Ile Arg Gly Ser Gln Leu Tyr His Arg Val Trp Asn Gly Leu Trp 
395 400 405 

Pro Leu Asp Arg Arg Cys Val Arg Glu Cys Tyr Ser Phe Gly Met 
410 415 420 

Glu Thr Leu Leu Lys Leu Asp Leu Lys Gly Thr Arg Arg Leu Phe 
425 430 435 

Asp Ala Phe Phe Asp Leu Asp Pro Lys Tyr Trp Gln Gly Phe Leu 
440 445 450 

Ser Ser Arg Leu Ser Val Lys Glu Leu Gly Leu Leu Ser Leu Cys 
455 460 465 

Leu Phe Gly His Gly Ser Asn Leu Thr Arg Leu Asp Ile Val Thr 
470 475 480 

Lys Cys Pro Val Pro Leu Val Arg Leu Ile Gly Asn Leu Ala Val 
485 490 495 

Glu Ser Leu 
498 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Glu Ala Leu Leu Lys Pro Phe Pro Ser Leu Leu Leu Ser Ser 
5 10 15 

Pro Thr Pro Tyr Arg Ser Ile Val Gln Gln Asn Pro Ser Phe Leu 
20 25 30 

Ser Pro Thr Thr Lys Lys Asn Gln Glu Asn Val Phe Leu Glu Thr 
35 40 45 

Lys Val Val Asn Phe Phe Val Ala Phe Leu Ile 
50 55 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 nucleic acids 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

TGACTTCACC CTTCTTTCTT GTCTTC 26 

(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 nucleic acids 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

AGAGTCTGGG TTC 13 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 nucleic acids 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CTAGTATCG 9 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 nucleic acids 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

CTAAATAT 8 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 nucleic acids 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
AATTTTCAAA 10 
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